RE: [PATCH v2 3/8] cxl/edac: Add CXL memory device patrol scrub control feature

From: Shiju Jose
Date: Fri Mar 28 2025 - 06:19:08 EST


>-----Original Message-----
>From: Dan Williams <dan.j.williams@xxxxxxxxx>
>Sent: 27 March 2025 01:47
>To: Shiju Jose <shiju.jose@xxxxxxxxxx>; linux-cxl@xxxxxxxxxxxxxxx;
>dan.j.williams@xxxxxxxxx; dave@xxxxxxxxxxxx; Jonathan Cameron
><jonathan.cameron@xxxxxxxxxx>; dave.jiang@xxxxxxxxx;
>alison.schofield@xxxxxxxxx; vishal.l.verma@xxxxxxxxx; ira.weiny@xxxxxxxxx;
>david@xxxxxxxxxx; Vilas.Sridharan@xxxxxxx
>Cc: linux-edac@xxxxxxxxxxxxxxx; linux-acpi@xxxxxxxxxxxxxxx; linux-
>mm@xxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; bp@xxxxxxxxx;
>tony.luck@xxxxxxxxx; rafael@xxxxxxxxxx; lenb@xxxxxxxxxx;
>mchehab@xxxxxxxxxx; leo.duran@xxxxxxx; Yazen.Ghannam@xxxxxxx;
>rientjes@xxxxxxxxxx; jiaqiyan@xxxxxxxxxx; Jon.Grimm@xxxxxxx;
>dave.hansen@xxxxxxxxxxxxxxx; naoya.horiguchi@xxxxxxx;
>james.morse@xxxxxxx; jthoughton@xxxxxxxxxx; somasundaram.a@xxxxxxx;
>erdemaktas@xxxxxxxxxx; pgonda@xxxxxxxxxx; duenwen@xxxxxxxxxx;
>gthelen@xxxxxxxxxx; wschwartz@xxxxxxxxxxxxxxxxxxx;
>dferguson@xxxxxxxxxxxxxxxxxxx; wbs@xxxxxxxxxxxxxxxxxxxxxx;
>nifan.cxl@xxxxxxxxx; tanxiaofei <tanxiaofei@xxxxxxxxxx>; Zengtao (B)
><prime.zeng@xxxxxxxxxxxxx>; Roberto Sassu <roberto.sassu@xxxxxxxxxx>;
>kangkang.shen@xxxxxxxxxxxxx; wanghuiqiang <wanghuiqiang@xxxxxxxxxx>;
>Linuxarm <linuxarm@xxxxxxxxxx>; Shiju Jose <shiju.jose@xxxxxxxxxx>
>Subject: Re: [PATCH v2 3/8] cxl/edac: Add CXL memory device patrol scrub
>control feature
>
>shiju.jose@ wrote:
>> From: Shiju Jose <shiju.jose@xxxxxxxxxx>
>>
>> CXL spec 3.2 section 8.2.10.9.11.1 describes the device patrol scrub
[...]
>> diff --git a/drivers/cxl/core/edac.c b/drivers/cxl/core/edac.c new
>> file mode 100644 index 000000000000..5ec3535785e1
>> --- /dev/null
>> +++ b/drivers/cxl/core/edac.c
>> @@ -0,0 +1,474 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * CXL EDAC memory feature driver.
>> + *
>> + * Copyright (c) 2024-2025 HiSilicon Limited.
>> + *
>> + * - Supports functions to configure EDAC features of the
>> + * CXL memory devices.
>> + * - Registers with the EDAC device subsystem driver to expose
>> + * the features sysfs attributes to the user for configuring
>> + * CXL memory RAS feature.
>> + */
>> +
>> +#include <linux/cleanup.h>
>> +#include <linux/edac.h>
>> +#include <linux/limits.h>
>> +#include <cxl/features.h>
>> +#include <cxl.h>
>> +#include <cxlmem.h>
>> +#include "core.h"
>> +
>> +#define CXL_NR_EDAC_DEV_FEATURES 1
>> +
>> +static struct rw_semaphore *cxl_acquire(struct rw_semaphore *rwsem) {
>> + if (down_read_interruptible(rwsem))
>> + return NULL;
>> +
>> + return rwsem;
>> +}
>> +
>> +DEFINE_FREE(cxl_unlock, struct rw_semaphore *, if (_T) up_read(_T))
>
>I know I suggested cxl_acquire() and cxl_unlock(), but this really is a generic
>facility.
>
>Let's call it rwsem_read_intr_acquire() and rwsem_read_release(), and I'll
>follow up later with Peter to see if he wants this to graduate from CXL.
>
>Also, go ahead and define it in cxl.h for now as I think other places in the
>subsystem could benefit from this approach.

Hi Dan,

Thanks for the comments.
Sure. should these definitions in cxl.h require in a separate patch?

>
>> +
>> +/*
>> + * CXL memory patrol scrub control
>> + */
>> +struct cxl_patrol_scrub_context {
>
>I like "patrol_scrub" spelled out here compared to "ps" used everywhere else.
Will change.

>
>> + u8 instance;
>> + u16 get_feat_size;
>> + u16 set_feat_size;
>> + u8 get_version;
>> + u8 set_version;
>> + u16 effects;
>> + struct cxl_memdev *cxlmd;
>> + struct cxl_region *cxlr;
>> +};
>> +
>> +/**
>> + * struct cxl_memdev_ps_params - CXL memory patrol scrub parameter data
>structure.
>> + * @enable: [IN & OUT] enable(1)/disable(0) patrol scrub.
>> + * @scrub_cycle_changeable: [OUT] scrub cycle attribute of patrol scrub is
>changeable.
>> + * @scrub_cycle_hrs: [IN] Requested patrol scrub cycle in hours.
>> + * [OUT] Current patrol scrub cycle in hours.
>> + * @min_scrub_cycle_hrs:[OUT] minimum patrol scrub cycle in hours
>supported.
>> + */
>> +struct cxl_memdev_ps_params {
>> + bool enable;
>> + bool scrub_cycle_changeable;
>
>This is set but unused. Even if it were to be used I would expect it to be set in the
>cxl_patrol_scrub_context.
I will add to cxl_patrol_scrub_context and will add an extra check against this when
user request to change the scrub rate.
>
>> + u8 scrub_cycle_hrs;
>> + u8 min_scrub_cycle_hrs;
>> +};
>
>I do not understand the point of this extra object and would prefer to keep
>intermediate data structures to a minimum.
>
>It looks like all this does is provide for short lived parsed caching of the raw
>hardware patrol scrube attributes. Just pass those raw objects around and
>provide helpers to do the conversion.

Sure. Will do. Was added to avoid more number of function parameters.
>
>The less data structures the less confusion for the next person that has to read
>this code a few years down the road.
>
>> +
>> +enum cxl_scrub_param {
>> + CXL_PS_PARAM_ENABLE,
>> + CXL_PS_PARAM_SCRUB_CYCLE,
>> +};
>
>This seems unforuntate, why not make non-zero scrub rate an implied enable
>and zero to disable? A non-zero sentinel value like U32_MAX can indicate "keep
>scrub rate unchanged".

These enums can be removed with remove using cxl_memdev_ps_params.

>
>> +#define CXL_MEMDEV_PS_SCRUB_CYCLE_CHANGE_CAP_MASK BIT(0)
>
>This CXL_MEMDEV_PS prefix is awkward due to overload with 'struct
>cxl_memdev'. Minimize it to something like:
>
>CXL_SCRUB_CONTROL_CHANGEABLE
>CXL_SCRUB_CONTROL_REALTIME
>CXL_SCRUB_CONTROL_CYCLE_MASK
>CXL_SCRUB_CONTROL_MIN_CYCLE_MASK

Will change.
>
>> +#define CXL_MEMDEV_PS_SCRUB_CYCLE_REALTIME_REPORT_CAP_MASK
>BIT(1)
>> +#define CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK GENMASK(7, 0)
>#define
>> +CXL_MEMDEV_PS_MIN_SCRUB_CYCLE_MASK GENMASK(15, 8) #define
>> +CXL_MEMDEV_PS_FLAG_ENABLED_MASK BIT(0)
>
>CXL_SCRUB_CONTROL_ENABLE
>
>...no need to call it a mask when it is just a single-bit, and when it is both the
>status and the control just call it "enable".

Sure. Will change.
>
>> +
>> +/*
>> + * See CXL spec rev 3.2 @8.2.10.9.11.1 Table 8-222 Device Patrol
>> +Scrub Control
>> + * Feature Readable Attributes.
>> + */
>> +struct cxl_memdev_ps_rd_attrs {
>> + u8 scrub_cycle_cap;
>> + __le16 scrub_cycle_hrs;
>
>"hours" is just 2 more characters than "hrs", I think we can afford the extra
>bytes.

Will change.
>
>[..]
>> +int devm_cxl_memdev_edac_register(struct cxl_memdev *cxlmd) {
>> + struct edac_dev_feature ras_features[CXL_NR_EDAC_DEV_FEATURES];
>> + int num_ras_features = 0;
>> + u8 scrub_inst = 0;
>> + int rc;
>> +
>> + rc = cxl_memdev_scrub_init(cxlmd, &ras_features[num_ras_features],
>> + scrub_inst);
>> + if (rc < 0 && rc != -EOPNOTSUPP)
>> + return rc;
>> +
>> + if (rc != -EOPNOTSUPP)
>> + num_ras_features++;
>> +
>> + char *cxl_dev_name __free(kfree) =
>> + kasprintf(GFP_KERNEL, "cxl_%s", dev_name(&cxlmd->dev));
>
>if (!cxl_dev_name)
> return -ENOMEM;

Will add.
>
>> +
>> + return edac_dev_register(&cxlmd->dev, cxl_dev_name, NULL,
>> + num_ras_features, ras_features); }
>> +EXPORT_SYMBOL_NS_GPL(devm_cxl_memdev_edac_register, "CXL");
>> +
>> +int devm_cxl_region_edac_register(struct cxl_region *cxlr) {
>> + struct edac_dev_feature ras_features[CXL_NR_EDAC_DEV_FEATURES];
>> + int num_ras_features = 0;
>> + u8 scrub_inst = 0;
>> + int rc;
>> +
>> + rc = cxl_region_scrub_init(cxlr, &ras_features[num_ras_features],
>> + scrub_inst);
>> + if (rc < 0)
>> + return rc;
>> +
>> + num_ras_features++;
>> +
>> + char *cxl_dev_name __free(kfree) =
>> + kasprintf(GFP_KERNEL, "cxl_%s", dev_name(&cxlr->dev));
>> +
>> + return edac_dev_register(&cxlr->dev, cxl_dev_name, NULL,
>> + num_ras_features, ras_features); }
>> +EXPORT_SYMBOL_NS_GPL(devm_cxl_region_edac_register, "CXL");
>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>> index b3260d433ec7..2aa6eb675fdf 100644
>> --- a/drivers/cxl/core/region.c
>> +++ b/drivers/cxl/core/region.c
>> @@ -3542,6 +3542,11 @@ static int cxl_region_probe(struct device *dev)
>> case CXL_PARTMODE_PMEM:
>> return devm_cxl_add_pmem_region(cxlr);
>> case CXL_PARTMODE_RAM:
>> + rc = devm_cxl_region_edac_register(cxlr);
>
>Why do only volatile regions get EDAC support? PMEM patrol scrub seems
>equally valid.

Will add devm_cxl_region_edac_register () for PMEM as well.

Thanks,
Shiju