RE: [RFC PATCH v8 01/10] ras: scrub: Add scrub subsystem

From: Shiju Jose
Date: Thu Apr 25 2024 - 14:11:27 EST


Hi Boris,

Thanks for the feedbacks.

Please find reply inline,

Thanks,
Shiju
>-----Original Message-----
>From: Borislav Petkov <bp@xxxxxxxxx>
>Sent: 25 April 2024 11:16
>To: Shiju Jose <shiju.jose@xxxxxxxxxx>
>Cc: linux-cxl@xxxxxxxxxxxxxxx; linux-acpi@xxxxxxxxxxxxxxx; linux-
>mm@xxxxxxxxx; dan.j.williams@xxxxxxxxx; dave@xxxxxxxxxxxx; Jonathan
>Cameron <jonathan.cameron@xxxxxxxxxx>; dave.jiang@xxxxxxxxx;
>alison.schofield@xxxxxxxxx; vishal.l.verma@xxxxxxxxx; ira.weiny@xxxxxxxxx;
>linux-edac@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; david@xxxxxxxxxx;
>Vilas.Sridharan@xxxxxxx; leo.duran@xxxxxxx; Yazen.Ghannam@xxxxxxx;
>rientjes@xxxxxxxxxx; jiaqiyan@xxxxxxxxxx; tony.luck@xxxxxxxxx;
>Jon.Grimm@xxxxxxx; dave.hansen@xxxxxxxxxxxxxxx; rafael@xxxxxxxxxx;
>lenb@xxxxxxxxxx; naoya.horiguchi@xxxxxxx; james.morse@xxxxxxx;
>jthoughton@xxxxxxxxxx; somasundaram.a@xxxxxxx;
>erdemaktas@xxxxxxxxxx; pgonda@xxxxxxxxxx; duenwen@xxxxxxxxxx;
>mike.malvestuto@xxxxxxxxx; gthelen@xxxxxxxxxx;
>wschwartz@xxxxxxxxxxxxxxxxxxx; dferguson@xxxxxxxxxxxxxxxxxxx;
>wbs@xxxxxxxxxxxxxxxxxxxxxx; nifan.cxl@xxxxxxxxx; tanxiaofei
><tanxiaofei@xxxxxxxxxx>; Zengtao (B) <prime.zeng@xxxxxxxxxxxxx>;
>kangkang.shen@xxxxxxxxxxxxx; wanghuiqiang <wanghuiqiang@xxxxxxxxxx>;
>Linuxarm <linuxarm@xxxxxxxxxx>
>Subject: Re: [RFC PATCH v8 01/10] ras: scrub: Add scrub subsystem
>
>On Sat, Apr 20, 2024 at 12:47:10AM +0800, shiju.jose@xxxxxxxxxx wrote:
>> From: Shiju Jose <shiju.jose@xxxxxxxxxx>
>>
>> Add scrub subsystem supports configuring the memory scrubbers in the
>> system. The scrub subsystem provides the interface for registering the
>> scrub devices. The scrub control attributes are provided to the user
>> in /sys/class/ras/rasX/scrub
>>
>> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
>> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
>> Signed-off-by: Shiju Jose <shiju.jose@xxxxxxxxxx>
>> ---
>> .../ABI/testing/sysfs-class-scrub-configure | 47 +++
>> drivers/ras/Kconfig | 7 +
>> drivers/ras/Makefile | 1 +
>> drivers/ras/memory_scrub.c | 271 ++++++++++++++++++
>> include/linux/memory_scrub.h | 37 +++
>> 5 files changed, 363 insertions(+)
>> create mode 100644
>> Documentation/ABI/testing/sysfs-class-scrub-configure
>> create mode 100755 drivers/ras/memory_scrub.c create mode 100755
>> include/linux/memory_scrub.h
>
>ERROR: modpost: missing MODULE_LICENSE() in drivers/ras/memory_scrub.o
>make[2]: *** [scripts/Makefile.modpost:145: Module.symvers] Error 1
>make[1]: *** [/mnt/kernel/kernel/2nd/linux/Makefile:1871: modpost] Error 2
>make: *** [Makefile:240: __sub-make] Error 2
>
>Each patch of yours needs to build.

Fixed.

>
>> diff --git a/Documentation/ABI/testing/sysfs-class-scrub-configure
>> b/Documentation/ABI/testing/sysfs-class-scrub-configure
>> new file mode 100644
>> index 000000000000..3ed77dbb00ad
>> --- /dev/null
>> +++ b/Documentation/ABI/testing/sysfs-class-scrub-configure
>> @@ -0,0 +1,47 @@
>> +What: /sys/class/ras/
>> +Date: March 2024
>> +KernelVersion: 6.9
>> +Contact: linux-kernel@xxxxxxxxxxxxxxx
>> +Description:
>> + The ras/ class subdirectory belongs to the
>> + common ras features such as scrub subsystem.
>> +
>> +What: /sys/class/ras/rasX/scrub/
>> +Date: March 2024
>> +KernelVersion: 6.9
>> +Contact: linux-kernel@xxxxxxxxxxxxxxx
>> +Description:
>> + The /sys/class/ras/ras{0,1,2,3,...}/scrub directories
>
>You have different scrubbers.
>
>I'd prefer if you put their names in here instead and do this structure:
>
>/sys/class/ras/scrub/cxl-patrol
> /ars
> /cxl-ecs
> /acpi-ras2
>
>and so on.
>
>Unless the idea is for those devices to have multiple RAS-specific functionality
>than just scrubbing. Then you want to do
>
>/sys/class/ras/cxl/scrub
> /other_function
>
>/sys/class/ras/ars/scrub
> /...
>
>You get the idea.
It is expected to have multiple RAS-specific functionalities other than scrubbing in long run.
Most of the classes in the kernel found as /sys/class/<class-name>/<class-name>X/

If not, however /sys/class/ras/<module -name>X/<feature> is more suitable because
there are multiple device instances such as cxl devices with scrub control feature.
For example, /sys/class/ras/cxlX/scrub

>
>> + correspond to each scrub device registered with the
>> + scrub subsystem.
>> +
>> +What: /sys/class/ras/rasX/scrub/name
>> +Date: March 2024
>> +KernelVersion: 6.9
>> +Contact: linux-kernel@xxxxxxxxxxxxxxx
>> +Description:
>> + (RO) name of the memory scrubber
>> +
>> +What: /sys/class/ras/rasX/scrub/enable_background
>> +Date: March 2024
>> +KernelVersion: 6.9
>> +Contact: linux-kernel@xxxxxxxxxxxxxxx
>> +Description:
>> + (RW) Enable/Disable background(patrol) scrubbing if supported.
>> +
>> +What: /sys/class/ras/rasX/scrub/rate_available
>
>That's dumping a range so I guess it should be called probably "possible_rates"
>or so, so that it is clear what it means.
>
>If some scrubbers support only a discrete set of rate values, then
>"possible_rates" fits too if you dump them as a list of values.
Sure. Will check.

>
>> +Date: March 2024
>> +KernelVersion: 6.9
>> +Contact: linux-kernel@xxxxxxxxxxxxxxx
>> +Description:
>> + (RO) Supported range for the scrub rate by the scrubber.
>> + The scrub rate represents in hours.
>> +
>> +What: /sys/class/ras/rasX/scrub/rate
>> +Date: March 2024
>> +KernelVersion: 6.9
>> +Contact: linux-kernel@xxxxxxxxxxxxxxx
>> +Description:
>> + (RW) The scrub rate specified and it must be with in the
>> + supported range by the scrubber.
>> + The scrub rate represents in hours.
>> diff --git a/drivers/ras/Kconfig b/drivers/ras/Kconfig index
>> fc4f4bb94a4c..181701479564 100644
>> --- a/drivers/ras/Kconfig
>> +++ b/drivers/ras/Kconfig
>> @@ -46,4 +46,11 @@ config RAS_FMPM
>> Memory will be retired during boot time and run time depending on
>> platform-specific policies.
>>
>> +config SCRUB
>> + tristate "Memory scrub driver"
>> + help
>> + This option selects the memory scrub subsystem, supports
>
>s/This option selects/Enable/
Sure.

>
>> + configuring the parameters of underlying scrubbers in the
>> + system for the DRAM memories.
>> +
>> endif
>> diff --git a/drivers/ras/Makefile b/drivers/ras/Makefile index
>> 11f95d59d397..89bcf0d84355 100644
>> --- a/drivers/ras/Makefile
>> +++ b/drivers/ras/Makefile
>> @@ -2,6 +2,7 @@
>> obj-$(CONFIG_RAS) += ras.o
>> obj-$(CONFIG_DEBUG_FS) += debugfs.o
>> obj-$(CONFIG_RAS_CEC) += cec.o
>> +obj-$(CONFIG_SCRUB) += memory_scrub.o
>>
>> obj-$(CONFIG_RAS_FMPM) += amd/fmpm.o
>> obj-y += amd/atl/
>> diff --git a/drivers/ras/memory_scrub.c b/drivers/ras/memory_scrub.c
>> new file mode 100755 index 000000000000..7e995380ec3a
>> --- /dev/null
>> +++ b/drivers/ras/memory_scrub.c
>> @@ -0,0 +1,271 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Memory scrub subsystem supports configuring the registered
>> + * memory scrubbers.
>> + *
>> + * Copyright (c) 2024 HiSilicon Limited.
>> + */
>> +
>> +#define pr_fmt(fmt) "MEM SCRUB: " fmt
>> +
>> +#include <linux/acpi.h>
>> +#include <linux/bitops.h>
>> +#include <linux/delay.h>
>> +#include <linux/kfifo.h>
>> +#include <linux/memory_scrub.h>
>> +#include <linux/platform_device.h>
>> +#include <linux/spinlock.h>
>> +
>> +/* memory scrubber config definitions */
>
>No need for that comment.
Will remove.
>
>> +static ssize_t rate_available_show(struct device *dev,
>> + struct device_attribute *attr,
>> + char *buf)
>> +{
>> + struct scrub_device *scrub_dev = to_scrub_device(dev);
>> + u64 min_sr, max_sr;
>> + int ret;
>> +
>> + ret = scrub_dev->ops->rate_avail_range(dev, &min_sr, &max_sr);
>> + if (ret)
>> + return ret;
>> +
>> + return sysfs_emit(buf, "0x%llx-0x%llx\n", min_sr, max_sr); }
>
>This glue driver will need to store the min and max scrub rates on init and
>rate_store() will have to verify the newly supplied rate is within that range
>before writing it.
>
>Not the user, nor the underlying hw driver.
Presently underlying hw driver does the check. I think this will become more
complex if does in the common rate_store() if we have to check against either a list of
possible rates or min and max rates.

>
>> +
>> +DEVICE_ATTR_RW(enable_background);
>> +DEVICE_ATTR_RO(name);
>> +DEVICE_ATTR_RW(rate);
>> +DEVICE_ATTR_RO(rate_available);
>
>static
>
>> +
>> +static struct attribute *scrub_attrs[] = {
>> + &dev_attr_enable_background.attr,
>> + &dev_attr_name.attr,
>> + &dev_attr_rate.attr,
>> + &dev_attr_rate_available.attr,
>> + NULL
>> +};
>> +
>> +static umode_t scrub_attr_visible(struct kobject *kobj,
>> + struct attribute *a, int attr_id) {
>> + struct device *dev = kobj_to_dev(kobj);
>> + struct scrub_device *scrub_dev = to_scrub_device(dev);
>> + const struct scrub_ops *ops = scrub_dev->ops;
>> +
>> + if (a == &dev_attr_enable_background.attr) {
>> + if (ops->set_enabled_bg && ops->get_enabled_bg)
>> + return a->mode;
>> + if (ops->get_enabled_bg)
>> + return 0444;
>> + return 0;
>> + }
>> + if (a == &dev_attr_name.attr)
>> + return ops->get_name ? a->mode : 0;
>> + if (a == &dev_attr_rate_available.attr)
>> + return ops->rate_avail_range ? a->mode : 0;
>> + if (a == &dev_attr_rate.attr) { /* Write only makes little sense */
>> + if (ops->rate_read && ops->rate_write)
>> + return a->mode;
>> + if (ops->rate_read)
>> + return 0444;
>> + return 0;
>> + }
>
>All of that stuff's permissions should be root-only.
Sure.

>
>> +
>> + return 0;
>> +}
>> +
>> +static const struct attribute_group scrub_attr_group = {
>> + .name = "scrub",
>> + .attrs = scrub_attrs,
>> + .is_visible = scrub_attr_visible,
>> +};
>> +
>> +static const struct attribute_group *scrub_attr_groups[] = {
>> + &scrub_attr_group,
>> + NULL
>> +};
>> +
>> +static void scrub_dev_release(struct device *dev) {
>> + struct scrub_device *scrub_dev = to_scrub_device(dev);
>> +
>> + ida_free(&scrub_ida, scrub_dev->id);
>> + kfree(scrub_dev);
>> +}
>> +
>> +static struct class scrub_class = {
>> + .name = "ras",
>> + .dev_groups = scrub_attr_groups,
>> + .dev_release = scrub_dev_release,
>> +};
>> +
>> +static struct device *
>> +scrub_device_register(struct device *parent, void *drvdata,
>> + const struct scrub_ops *ops)
>> +{
>> + struct scrub_device *scrub_dev;
>> + struct device *hdev;
>> + int err;
>> +
>> + scrub_dev = kzalloc(sizeof(*scrub_dev), GFP_KERNEL);
>> + if (!scrub_dev)
>> + return ERR_PTR(-ENOMEM);
>> + hdev = &scrub_dev->dev;
>> +
>> + scrub_dev->id = ida_alloc(&scrub_ida, GFP_KERNEL);
>
>What's that silly thing for?
This is the ras instance id (X) used for scrub control feature, /sys/class/ras/rasX/scrub/

>
>> + if (scrub_dev->id < 0) {
>> + kfree(scrub_dev);
>> + return ERR_PTR(-ENOMEM);
>> + }
>> +
>> + scrub_dev->ops = ops;
>> + hdev->class = &scrub_class;
>> + hdev->parent = parent;
>> + dev_set_drvdata(hdev, drvdata);
>> + dev_set_name(hdev, SCRUB_ID_FORMAT, scrub_dev->id);
>> + err = device_register(hdev);
>> + if (err) {
>> + put_device(hdev);
>> + return ERR_PTR(err);
>> + }
>> +
>> + return hdev;
>> +}
>> +
>> +static void devm_scrub_release(void *dev) {
>> + device_unregister(dev);
>> +}
>> +
>> +/**
>> + * devm_scrub_device_register - register scrubber device
>> + * @dev: the parent device
>> + * @drvdata: driver data to attach to the scrub device
>> + * @ops: pointer to scrub_ops structure (optional)
>> + *
>> + * Returns the pointer to the new device on success, ERR_PTR() otherwise.
>> + * The new device would be automatically unregistered with the parent
>device.
>> + */
>> +struct device *
>> +devm_scrub_device_register(struct device *dev, void *drvdata,
>> + const struct scrub_ops *ops)
>> +{
>> + struct device *hdev;
>> + int ret;
>> +
>> + if (!dev)
>> + return ERR_PTR(-EINVAL);
>> +
>> + hdev = scrub_device_register(dev, drvdata, ops);
>> + if (IS_ERR(hdev))
>> + return hdev;
>> +
>> + ret = devm_add_action_or_reset(dev, devm_scrub_release, hdev);
>> + if (ret)
>> + return ERR_PTR(ret);
>> +
>> + return hdev;
>> +}
>> +EXPORT_SYMBOL_GPL(devm_scrub_device_register);
>> +
>> +static int __init memory_scrub_control_init(void) {
>> + return class_register(&scrub_class); }
>> +subsys_initcall(memory_scrub_control_init);
>
>You can't just blindly register this thing without checking whether there are even
>any hw scrubber devices on the system.
I think it happens only when a dependent module as autoloaded based on a scrub device existing with exception of memory scrub control built in and who would build this in?

>
>--
>Regards/Gruss,
> Boris.
>
Thanks,
Shiju