Re: [PATCH 2/8] cxl/memfeature: Add CXL memory device patrol scrub control feature
From: Dan Williams
Date: Thu Mar 06 2025 - 20:39:35 EST
shiju.jose@ wrote:
> From: Shiju Jose <shiju.jose@xxxxxxxxxx>
>
> CXL spec 3.2 section 8.2.10.9.11.1 describes the device patrol scrub
> control feature. The device patrol scrub proactively locates and
> corrects errors on a regular cycle.
>
> Allow specifying the number of hours within which the patrol scrub must be
> completed, subject to minimum and maximum limits reported by the device.
> Also allow disabling scrub, trading off error rates against
> performance.
>
> Add support for patrol scrub control on CXL memory devices.
> Register with the EDAC device driver, which retrieves the scrub attribute
> descriptors from EDAC scrub and exposes the sysfs scrub control attributes
> to userspace. For example, scrub control for the CXL memory device
> "cxl_mem0" is exposed in /sys/bus/edac/devices/cxl_mem0/scrubX/.
>
> Additionally, add support for region-based CXL memory patrol scrub control.
> CXL memory regions may be interleaved across one or more CXL memory
> devices. For example, region-based scrub control for "cxl_region1" is
> exposed in /sys/bus/edac/devices/cxl_region1/scrubX/.
>
> Reviewed-by: Dave Jiang <dave.jiang@xxxxxxxxx>
> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> Signed-off-by: Shiju Jose <shiju.jose@xxxxxxxxxx>
> ---
> Documentation/edac/scrub.rst | 64 +++++
> drivers/cxl/Kconfig | 18 ++
> drivers/cxl/core/Makefile | 1 +
> drivers/cxl/core/memfeatures.c | 476 +++++++++++++++++++++++++++++++++
> drivers/cxl/core/region.c | 5 +
> drivers/cxl/cxlmem.h | 10 +
> drivers/cxl/mem.c | 4 +
> 7 files changed, 578 insertions(+)
> create mode 100644 drivers/cxl/core/memfeatures.c
>
> diff --git a/Documentation/edac/scrub.rst b/Documentation/edac/scrub.rst
> index daab929cdba1..788cf43188a4 100644
> --- a/Documentation/edac/scrub.rst
> +++ b/Documentation/edac/scrub.rst
> @@ -264,3 +264,67 @@ Sysfs files are documented in
> `Documentation/ABI/testing/sysfs-edac-scrub`
>
> `Documentation/ABI/testing/sysfs-edac-ecs`
> +
> +Examples
> +--------
> +
> +The usage takes the form shown in these examples:
> +
> +1. CXL memory device patrol scrubber
> +
> +1.1 Device based scrubbing
> +
> +1.1.1. Query the device's default/current scrub cycle setting.
> +
> +# cat /sys/bus/edac/devices/cxl_mem0/scrub0/current_cycle_duration
> +
> +43200
> +
> +1.1.2. Query the range of scrub cycles supported by the device.
> +
> +# cat /sys/bus/edac/devices/cxl_mem0/scrub0/min_cycle_duration
> +
> +3600
> +
> +# cat /sys/bus/edac/devices/cxl_mem0/scrub0/max_cycle_duration
> +
> +918000
> +
> +1.1.3. Program scrubbing for a device to repeat every 21600 seconds (a quarter of a day).
> +
> +# echo 21600 > /sys/bus/edac/devices/cxl_mem0/scrub0/current_cycle_duration
> +
> +# echo 1 > /sys/bus/edac/devices/cxl_mem0/scrub0/enable_background
Like I said before, I find documentation like the above to be pure
noise. Please drop the echo and cat demos;
Documentation/ABI/testing/sysfs-edac-scrub is sufficient.
> +
> +1.2. Region based scrubbing
Yay, actual content!
> +
> +CXL memory is exposed to the memory management subsystem and ultimately to
> +via CXL regions. These can incorporate one or more parts of multiple CXL
> +Type 3 devices with traffic interleaved across them. The user may want to
s/Type 3/memory expander/ or "memory device", as even the specification
is getting away from "Type" language, which only long-time CXL
specification experts understand, not end users.
> +control the scrub rate via this more abstract region instead of having to
> +figure out the constituent devices and program them separately. The scrub
> +rate for each device covers the whole device. Thus, if multiple regions
> +use parts of that device, requests to scrub other regions may result
> +in a higher scrub rate than requested for this specific region.
Does the base EDAC documentation also talk about this conflict when a
device's scrub rate can be affected by aggregation objects? Or is
aggregation a CXL-specific interface phenomenon?
Questions I imagine an administrator would want the documentation to
answer are: What is the precedence if someone sets the region scrub rate
and then the device? How does changing the device scrub rate affect the
region rate? How does changing the scrub rate for a region affect the
scrub rate for other regions on the same device?
The documentation should be sufficient to write a tool such that someone
would not get surprised by the multiple-regions per-device case.
> +1.2.1 Query the default/current scrub cycle setting for a CXL memory region.
> +
> +# cat /sys/bus/edac/devices/cxl_region0/scrub0/current_cycle_duration
> +
> +86400
> +
> +1.2.2 Query the range of supported scrub cycles for a CXL memory region.
> +
> +# cat /sys/bus/edac/devices/cxl_region0/scrub0/min_cycle_duration
> +
> +3600
> +
> +# cat /sys/bus/edac/devices/cxl_region0/scrub0/max_cycle_duration
> +
> +918000
> +
> +1.2.3 Program scrubbing for a region to repeat every 43200 seconds (half a day).
> +
> +# echo 43200 > /sys/bus/edac/devices/cxl_region0/scrub0/current_cycle_duration
More noise to trim. Replace it with a cross-reference back to
Documentation/ABI/testing/sysfs-edac-scrub.
> +
> +# echo 1 > /sys/bus/edac/devices/cxl_region0/scrub0/enable_background
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index 205547e5543a..b83bdb30b702 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -113,6 +113,24 @@ config CXL_FEATURES
>
> If unsure say 'n'
>
> +config CXL_RAS_FEATURES
> + bool "CXL: Memory RAS features"
The CXL subsystem enables "RAS Features" outside of this new enabling.
Just name this what it is, the CXL wrapper around the EDAC_SCRUB core =>
config CXL_EDAC_SCRUB.
> + depends on CXL_MEM
> + depends on CXL_FEATURES
> + depends on EDAC=y || (CXL_BUS=m && EDAC=m)
A shorter way of saying this I believe is "depends on EDAC >= CXL_BUS"
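That could look something like this (untested sketch, assuming the
CXL_EDAC_SCRUB rename suggested above):

config CXL_EDAC_SCRUB
	bool "CXL: EDAC scrub control"
	depends on CXL_MEM
	depends on CXL_FEATURES
	depends on EDAC >= CXL_BUS
	depends on EDAC_SCRUB

Kconfig supports ordering comparisons on tristates, so "EDAC >= CXL_BUS"
keeps the EDAC core at least as built-in as the CXL core.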
> + depends on EDAC_SCRUB
> + help
> + The CXL memory RAS feature control is optional and allows host to
> + control the RAS features configurations of CXL Type 3 devices.
s/RAS feature/scrub/ or "scrub + repair"
s/Type 3/memory expander/
> + It registers with the EDAC device subsystem to expose control
I don't think an end user cares that "it registers". I would say
something like
"When enabled 'cxl_mem' and 'cxl_region' EDAC devices are published with
scrub control attributes as described by
Documentation/ABI/testing/sysfs-edac-scrub"
> + attributes of CXL memory device's RAS features to the user.
> + It provides interface functions to support configuring the CXL
> + memory device's RAS features.
> + Say 'y/m' if you have an expert need to change default settings
depends on EXPERT?
> + of a memory RAS feature established by the platform/device (e.g.
> + scrub rates for the patrol scrub feature). Otherwise say 'n'.
> +
> config CXL_PORT
> default CXL_BUS
> tristate
> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> index e1d591e52d4b..2f48845b86d7 100644
> --- a/drivers/cxl/core/Makefile
> +++ b/drivers/cxl/core/Makefile
> @@ -18,4 +18,5 @@ cxl_core-y += acpi.o
> cxl_core-$(CONFIG_TRACING) += trace.o
> cxl_core-$(CONFIG_CXL_REGION) += region.o
> cxl_core-$(CONFIG_CXL_FEATURES) += features.o
> +cxl_core-$(CONFIG_CXL_RAS_FEATURES) += memfeatures.o
> cxl_core-$(CONFIG_CXL_MCE) += mce.o
> diff --git a/drivers/cxl/core/memfeatures.c b/drivers/cxl/core/memfeatures.c
> new file mode 100644
> index 000000000000..7a5c0645a07e
> --- /dev/null
> +++ b/drivers/cxl/core/memfeatures.c
Perhaps edac.c?
> @@ -0,0 +1,476 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * CXL memory RAS feature driver.
> + *
> + * Copyright (c) 2024-2025 HiSilicon Limited.
> + *
> + * - Supports functions to configure RAS features of the
> + * CXL memory devices.
> + * - Registers with the EDAC device subsystem driver to expose
> + * the features sysfs attributes to the user for configuring
> + * CXL memory RAS feature.
> + */
> +
> +#include <linux/cleanup.h>
> +#include <linux/edac.h>
> +#include <linux/limits.h>
> +#include <cxl/features.h>
> +#include <cxl.h>
> +#include <cxlmem.h>
> +#include "core.h"
> +
> +#define CXL_DEV_NUM_RAS_FEATURES 1
This name tells me nothing about its purpose.
#define CXL_NR_EDAC_DEV_FEATURE 1
...matches its usage.
> +#define CXL_DEV_HOUR_IN_SECS 3600
Just drop this and write 3600; that is actually clearer than this
define.
> +
> +#define CXL_DEV_NAME_LEN 128
See below; just use kasprintf().
> +
> +static int cxl_hold_region_and_dpa(void)
> +{
> + int rc;
> +
> + rc = down_read_interruptible(&cxl_region_rwsem);
> + if (rc)
> + return rc;
> +
> + rc = down_read_interruptible(&cxl_dpa_rwsem);
> + if (rc) {
> + up_read(&cxl_region_rwsem);
> + return rc;
> + }
> +
> + return 0;
> +}
> +
> +static void cxl_release_region_and_dpa(void)
> +{
> + up_read(&cxl_dpa_rwsem);
> + up_read(&cxl_region_rwsem);
> +}
> +
> +/*
> + * CXL memory patrol scrub control functions
> + */
> +struct cxl_patrol_scrub_context {
> + u8 instance;
> + u16 get_feat_size;
> + u16 set_feat_size;
> + u8 get_version;
> + u8 set_version;
> + u16 effects;
> + struct cxl_memdev *cxlmd;
> + struct cxl_region *cxlr;
> +};
> +
> +/**
> + * struct cxl_memdev_ps_params - CXL memory patrol scrub parameter data structure.
> + * @enable: [IN & OUT] enable(1)/disable(0) patrol scrub.
> + * @scrub_cycle_changeable: [OUT] scrub cycle attribute of patrol scrub is changeable.
> + * @scrub_cycle_hrs: [IN] Requested patrol scrub cycle in hours.
> + * [OUT] Current patrol scrub cycle in hours.
> + * @min_scrub_cycle_hrs:[OUT] minimum patrol scrub cycle in hours supported.
> + */
> +struct cxl_memdev_ps_params {
> + bool enable;
> + bool scrub_cycle_changeable;
> + u8 scrub_cycle_hrs;
> + u8 min_scrub_cycle_hrs;
> +};
> +
> +enum cxl_scrub_param {
> + CXL_PS_PARAM_ENABLE,
> + CXL_PS_PARAM_SCRUB_CYCLE,
> +};
> +
> +#define CXL_MEMDEV_PS_SCRUB_CYCLE_CHANGE_CAP_MASK BIT(0)
> +#define CXL_MEMDEV_PS_SCRUB_CYCLE_REALTIME_REPORT_CAP_MASK BIT(1)
> +#define CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK GENMASK(7, 0)
> +#define CXL_MEMDEV_PS_MIN_SCRUB_CYCLE_MASK GENMASK(15, 8)
> +#define CXL_MEMDEV_PS_FLAG_ENABLED_MASK BIT(0)
> +
> +/*
> + * See CXL spec rev 3.2 @8.2.10.9.11.1 Table 8-222 Device Patrol Scrub Control
> + * Feature Readable Attributes.
> + */
> +struct cxl_memdev_ps_rd_attrs {
> + u8 scrub_cycle_cap;
> + __le16 scrub_cycle_hrs;
> + u8 scrub_flags;
> +} __packed;
> +
> +/*
> + * See CXL spec rev 3.2 @8.2.10.9.11.1 Table 8-223 Device Patrol Scrub Control
> + * Feature Writable Attributes.
> + */
> +struct cxl_memdev_ps_wr_attrs {
> + u8 scrub_cycle_hrs;
> + u8 scrub_flags;
> +} __packed;
> +
> +static int cxl_mem_ps_get_attrs(struct cxl_mailbox *cxl_mbox,
> + struct cxl_memdev_ps_params *params)
> +{
> + size_t rd_data_size = sizeof(struct cxl_memdev_ps_rd_attrs);
> + u16 scrub_cycle_hrs;
> + size_t data_size;
> + struct cxl_memdev_ps_rd_attrs *rd_attrs __free(kfree) =
> + kzalloc(rd_data_size, GFP_KERNEL);
> + if (!rd_attrs)
> + return -ENOMEM;
> +
> + data_size = cxl_get_feature(cxl_mbox, &CXL_FEAT_PATROL_SCRUB_UUID,
> + CXL_GET_FEAT_SEL_CURRENT_VALUE,
> + rd_attrs, rd_data_size, 0, NULL);
> + if (!data_size)
> + return -EIO;
> +
> + params->scrub_cycle_changeable = FIELD_GET(CXL_MEMDEV_PS_SCRUB_CYCLE_CHANGE_CAP_MASK,
> + rd_attrs->scrub_cycle_cap);
> + params->enable = FIELD_GET(CXL_MEMDEV_PS_FLAG_ENABLED_MASK,
> + rd_attrs->scrub_flags);
> + scrub_cycle_hrs = le16_to_cpu(rd_attrs->scrub_cycle_hrs);
> + params->scrub_cycle_hrs = FIELD_GET(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK,
> + scrub_cycle_hrs);
> + params->min_scrub_cycle_hrs = FIELD_GET(CXL_MEMDEV_PS_MIN_SCRUB_CYCLE_MASK,
> + scrub_cycle_hrs);
> +
> + return 0;
> +}
> +
> +static int cxl_ps_get_attrs(struct cxl_patrol_scrub_context *cxl_ps_ctx,
> + struct cxl_memdev_ps_params *params)
> +{
> + struct cxl_mailbox *cxl_mbox;
> + struct cxl_memdev *cxlmd;
> + u16 min_scrub_cycle = 0;
> + int i, ret;
> +
> + if (cxl_ps_ctx->cxlr) {
> + struct cxl_region *cxlr = cxl_ps_ctx->cxlr;
> + struct cxl_region_params *p = &cxlr->params;
> +
> + ret = cxl_hold_region_and_dpa();
> + if (ret)
> + return ret;
> + for (i = p->interleave_ways - 1; i >= 0; i--) {
Ok, the region lock is needed to keep cxl_region_params consistent,
but why is the dpa lock needed?
> + struct cxl_endpoint_decoder *cxled = p->targets[i];
> +
> + cxlmd = cxled_to_memdev(cxled);
> + cxl_mbox = &cxlmd->cxlds->cxl_mbox;
> + ret = cxl_mem_ps_get_attrs(cxl_mbox, params);
> + if (ret)
> + return ret;
Missing unlock? This early return leaks both read locks.
I am not a fan of cxl_hold_region_and_dpa() giving the locks private
names just for this file.
You can do something like:
struct rw_semaphore *region_lock __free(cxl_unlock) = cxl_acquire(&cxl_region_rwsem);
if (!region_lock)
return -EINTR;
...where cxl_acquire does the conditional acquisition, and we do not need
to audit for more missing unlock calls. As is, this
cxl_hold_region_and_dpa() proposal injures the ability to grep for
cxl_region_rwsem and be taken directly to acquisition sites.
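For completeness, a minimal sketch of what that assumes (cxl_acquire()
and the cxl_unlock cleanup class are hypothetical names built on
linux/cleanup.h, not existing helpers):

/* hypothetical: conditional read-lock acquisition for scope-based release */
static inline struct rw_semaphore *cxl_acquire(struct rw_semaphore *rwsem)
{
	/* NULL signals interruption so the caller can return -EINTR */
	if (down_read_interruptible(rwsem))
		return NULL;
	return rwsem;
}

DEFINE_FREE(cxl_unlock, struct rw_semaphore *, if (_T) up_read(_T))

The dpa lock could get the same treatment, after which every early
return in these loops drops both locks automatically.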
> +
> + if (params->min_scrub_cycle_hrs > min_scrub_cycle)
> + min_scrub_cycle = params->min_scrub_cycle_hrs;
> + }
> + cxl_release_region_and_dpa();
> +
> + params->min_scrub_cycle_hrs = min_scrub_cycle;
> + return 0;
> + }
> + cxl_mbox = &cxl_ps_ctx->cxlmd->cxlds->cxl_mbox;
> +
> + return cxl_mem_ps_get_attrs(cxl_mbox, params);
> +}
> +
> +static int cxl_mem_ps_set_attrs(struct device *dev,
> + struct cxl_patrol_scrub_context *cxl_ps_ctx,
> + struct cxl_mailbox *cxl_mbox,
> + struct cxl_memdev_ps_params *params,
> + enum cxl_scrub_param param_type)
> +{
> + struct cxl_memdev_ps_wr_attrs wr_attrs;
> + struct cxl_memdev_ps_params rd_params;
> + int ret;
> +
> + ret = cxl_mem_ps_get_attrs(cxl_mbox, &rd_params);
> + if (ret) {
> + dev_dbg(dev, "Get cxlmemdev patrol scrub params failed ret=%d\n", ret);
> + return ret;
> + }
> +
> + switch (param_type) {
> + case CXL_PS_PARAM_ENABLE:
> + wr_attrs.scrub_flags = FIELD_PREP(CXL_MEMDEV_PS_FLAG_ENABLED_MASK,
> + params->enable);
> + wr_attrs.scrub_cycle_hrs = FIELD_PREP(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK,
> + rd_params.scrub_cycle_hrs);
> + break;
> + case CXL_PS_PARAM_SCRUB_CYCLE:
> + if (params->scrub_cycle_hrs < rd_params.min_scrub_cycle_hrs) {
> + dev_dbg(dev, "Invalid CXL patrol scrub cycle(%d) to set\n",
> + params->scrub_cycle_hrs);
> + dev_dbg(dev, "Minimum supported CXL patrol scrub cycle in hour %d\n",
> + rd_params.min_scrub_cycle_hrs);
> + return -EINVAL;
> + }
> + wr_attrs.scrub_cycle_hrs = FIELD_PREP(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK,
> + params->scrub_cycle_hrs);
> + wr_attrs.scrub_flags = FIELD_PREP(CXL_MEMDEV_PS_FLAG_ENABLED_MASK,
> + rd_params.enable);
> + break;
> + }
> +
> + ret = cxl_set_feature(cxl_mbox, &CXL_FEAT_PATROL_SCRUB_UUID,
> + cxl_ps_ctx->set_version,
> + &wr_attrs, sizeof(wr_attrs),
> + CXL_SET_FEAT_FLAG_DATA_SAVED_ACROSS_RESET,
> + 0, NULL);
> + if (ret) {
> + dev_dbg(dev, "CXL patrol scrub set feature failed ret=%d\n", ret);
> + return ret;
> + }
> +
> + return 0;
> +}
> +
> +static int cxl_ps_set_attrs(struct device *dev,
> + struct cxl_patrol_scrub_context *cxl_ps_ctx,
> + struct cxl_memdev_ps_params *params,
> + enum cxl_scrub_param param_type)
> +{
> + struct cxl_mailbox *cxl_mbox;
> + struct cxl_memdev *cxlmd;
> + int ret, i;
> +
> + if (cxl_ps_ctx->cxlr) {
> + struct cxl_region *cxlr = cxl_ps_ctx->cxlr;
> + struct cxl_region_params *p = &cxlr->params;
> +
> + ret = cxl_hold_region_and_dpa();
> + if (ret)
> + return ret;
> + for (i = p->interleave_ways - 1; i >= 0; i--) {
> + struct cxl_endpoint_decoder *cxled = p->targets[i];
> +
> + cxlmd = cxled_to_memdev(cxled);
> + cxl_mbox = &cxlmd->cxlds->cxl_mbox;
> + ret = cxl_mem_ps_set_attrs(dev, cxl_ps_ctx, cxl_mbox,
> + params, param_type);
Similar locking problem as before, it seems.
> + if (ret)
> + return ret;
This probably wants to be a "break;", but I would prefer to keep it an
early return with the unlock moved to a scope-based cleanup scheme.
> + }
> + cxl_release_region_and_dpa();
> +
> + return 0;
> + }
> + cxl_mbox = &cxl_ps_ctx->cxlmd->cxlds->cxl_mbox;
> +
> + return cxl_mem_ps_set_attrs(dev, cxl_ps_ctx, cxl_mbox,
> + params, param_type);
> +}
> +
> +static int cxl_patrol_scrub_get_enabled_bg(struct device *dev, void *drv_data, bool *enabled)
> +{
> + struct cxl_patrol_scrub_context *ctx = drv_data;
> + struct cxl_memdev_ps_params params;
> + int ret;
> +
> + ret = cxl_ps_get_attrs(ctx, &params);
> + if (ret)
> + return ret;
> +
> + *enabled = params.enable;
> +
> + return 0;
> +}
> +
> +static int cxl_patrol_scrub_set_enabled_bg(struct device *dev, void *drv_data, bool enable)
> +{
> + struct cxl_patrol_scrub_context *ctx = drv_data;
> + struct cxl_memdev_ps_params params = {
> + .enable = enable,
> + };
> +
> + return cxl_ps_set_attrs(dev, ctx, &params, CXL_PS_PARAM_ENABLE);
> +}
> +
> +static int cxl_patrol_scrub_read_min_scrub_cycle(struct device *dev, void *drv_data,
> + u32 *min)
> +{
> + struct cxl_patrol_scrub_context *ctx = drv_data;
> + struct cxl_memdev_ps_params params;
> + int ret;
> +
> + ret = cxl_ps_get_attrs(ctx, &params);
> + if (ret)
> + return ret;
> + *min = params.min_scrub_cycle_hrs * CXL_DEV_HOUR_IN_SECS;
> +
> + return 0;
> +}
> +
> +static int cxl_patrol_scrub_read_max_scrub_cycle(struct device *dev, void *drv_data,
> + u32 *max)
> +{
> + *max = U8_MAX * CXL_DEV_HOUR_IN_SECS; /* Max set by register size */
> +
> + return 0;
> +}
> +
> +static int cxl_patrol_scrub_read_scrub_cycle(struct device *dev, void *drv_data,
> + u32 *scrub_cycle_secs)
> +{
> + struct cxl_patrol_scrub_context *ctx = drv_data;
> + struct cxl_memdev_ps_params params;
> + int ret;
> +
> + ret = cxl_ps_get_attrs(ctx, &params);
> + if (ret)
> + return ret;
> +
> + *scrub_cycle_secs = params.scrub_cycle_hrs * CXL_DEV_HOUR_IN_SECS;
> +
> + return 0;
> +}
> +
> +static int cxl_patrol_scrub_write_scrub_cycle(struct device *dev, void *drv_data,
> + u32 scrub_cycle_secs)
> +{
> + struct cxl_patrol_scrub_context *ctx = drv_data;
> + struct cxl_memdev_ps_params params = {
> + .scrub_cycle_hrs = scrub_cycle_secs / CXL_DEV_HOUR_IN_SECS,
> + };
> +
> + return cxl_ps_set_attrs(dev, ctx, &params, CXL_PS_PARAM_SCRUB_CYCLE);
> +}
> +
> +static const struct edac_scrub_ops cxl_ps_scrub_ops = {
> + .get_enabled_bg = cxl_patrol_scrub_get_enabled_bg,
> + .set_enabled_bg = cxl_patrol_scrub_set_enabled_bg,
> + .get_min_cycle = cxl_patrol_scrub_read_min_scrub_cycle,
> + .get_max_cycle = cxl_patrol_scrub_read_max_scrub_cycle,
> + .get_cycle_duration = cxl_patrol_scrub_read_scrub_cycle,
> + .set_cycle_duration = cxl_patrol_scrub_write_scrub_cycle,
> +};
> +
> +static int cxl_memdev_scrub_init(struct cxl_memdev *cxlmd,
> + struct edac_dev_feature *ras_feature, u8 scrub_inst)
> +{
> + struct cxl_patrol_scrub_context *cxl_ps_ctx;
> + struct cxl_feat_entry *feat_entry;
> +
> + feat_entry = cxl_get_feature_entry(cxlmd->cxlds, &CXL_FEAT_PATROL_SCRUB_UUID);
> + if (IS_ERR(feat_entry))
> + return -EOPNOTSUPP;
> +
> + if (!(le32_to_cpu(feat_entry->flags) & CXL_FEATURE_F_CHANGEABLE))
> + return -EOPNOTSUPP;
> +
> + cxl_ps_ctx = devm_kzalloc(&cxlmd->dev, sizeof(*cxl_ps_ctx), GFP_KERNEL);
> + if (!cxl_ps_ctx)
> + return -ENOMEM;
> +
> + *cxl_ps_ctx = (struct cxl_patrol_scrub_context) {
> + .get_feat_size = le16_to_cpu(feat_entry->get_feat_size),
> + .set_feat_size = le16_to_cpu(feat_entry->set_feat_size),
> + .get_version = feat_entry->get_feat_ver,
> + .set_version = feat_entry->set_feat_ver,
> + .effects = le16_to_cpu(feat_entry->effects),
> + .instance = scrub_inst,
> + .cxlmd = cxlmd,
> + };
> +
> + ras_feature->ft_type = RAS_FEAT_SCRUB;
> + ras_feature->instance = cxl_ps_ctx->instance;
> + ras_feature->scrub_ops = &cxl_ps_scrub_ops;
> + ras_feature->ctx = cxl_ps_ctx;
> +
> + return 0;
> +}
> +
> +static int cxl_region_scrub_init(struct cxl_region *cxlr,
> + struct edac_dev_feature *ras_feature, u8 scrub_inst)
> +{
> + struct cxl_patrol_scrub_context *cxl_ps_ctx;
> + struct cxl_region_params *p = &cxlr->params;
> + struct cxl_feat_entry *feat_entry = NULL;
> + struct cxl_memdev *cxlmd;
> + int i;
> +
> + /*
> + * The cxl_region_rwsem must be held if the code below is used in a context
> + * other than when the region is in the probe state, as shown here.
> + */
> + for (i = p->interleave_ways - 1; i >= 0; i--) {
> + struct cxl_endpoint_decoder *cxled = p->targets[i];
> +
> + cxlmd = cxled_to_memdev(cxled);
> + feat_entry = cxl_get_feature_entry(cxlmd->cxlds, &CXL_FEAT_PATROL_SCRUB_UUID);
> + if (IS_ERR(feat_entry))
> + return -EOPNOTSUPP;
> +
> + if (!(le32_to_cpu(feat_entry->flags) & CXL_FEATURE_F_CHANGEABLE))
> + return -EOPNOTSUPP;
> + }
> + if (!feat_entry)
> + return -EOPNOTSUPP;
> +
> + cxl_ps_ctx = devm_kzalloc(&cxlr->dev, sizeof(*cxl_ps_ctx), GFP_KERNEL);
> + if (!cxl_ps_ctx)
> + return -ENOMEM;
> +
> + *cxl_ps_ctx = (struct cxl_patrol_scrub_context) {
> + .get_feat_size = le16_to_cpu(feat_entry->get_feat_size),
> + .set_feat_size = le16_to_cpu(feat_entry->set_feat_size),
> + .get_version = feat_entry->get_feat_ver,
> + .set_version = feat_entry->set_feat_ver,
> + .effects = le16_to_cpu(feat_entry->effects),
> + .instance = scrub_inst,
> + .cxlr = cxlr,
> + };
> +
> + ras_feature->ft_type = RAS_FEAT_SCRUB;
> + ras_feature->instance = cxl_ps_ctx->instance;
> + ras_feature->scrub_ops = &cxl_ps_scrub_ops;
> + ras_feature->ctx = cxl_ps_ctx;
> +
> + return 0;
> +}
I didn't check the details of the above for correctness; I am mainly
reviewing this for locking and lifetime rules.
> +
> +int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd)
> +{
> + struct edac_dev_feature ras_features[CXL_DEV_NUM_RAS_FEATURES];
> + char cxl_dev_name[CXL_DEV_NAME_LEN];
> + int num_ras_features = 0;
> + u8 scrub_inst = 0;
> + int rc;
> +
> + rc = cxl_memdev_scrub_init(cxlmd, &ras_features[num_ras_features],
> + scrub_inst);
> + if (rc < 0 && rc != -EOPNOTSUPP)
> + return rc;
> +
> + if (rc != -EOPNOTSUPP)
> + num_ras_features++;
> +
> + snprintf(cxl_dev_name, sizeof(cxl_dev_name), "%s_%s",
> + "cxl", dev_name(&cxlmd->dev));
Little point in using snprintf() if you're not going to check for
truncation. Perhaps just do kasprintf() with a __free(kfree) and skip
the stack allocation?
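I.e. something like this (untested sketch):

	char *cxl_dev_name __free(kfree) =
		kasprintf(GFP_KERNEL, "cxl_%s", dev_name(&cxlmd->dev));
	if (!cxl_dev_name)
		return -ENOMEM;

kasprintf() sizes the allocation to fit, so no truncation check is
needed, and __free(kfree) releases the string on all return paths.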