Re: [PATCH v12 02/17] EDAC: Add EDAC scrub control driver

From: Borislav Petkov
Date: Fri Sep 13 2024 - 13:26:14 EST


On Wed, Sep 11, 2024 at 10:04:31AM +0100, shiju.jose@xxxxxxxxxx wrote:
> From: Shiju Jose <shiju.jose@xxxxxxxxxx>
>
> Add generic EDAC scrub control driver supports configuring the memory scrubbers

s/supports configuring the/in order to configure/

> in the system. The device with scrub feature, get the scrub descriptor from the
> EDAC scrub and registers with the EDAC RAS feature driver, which adds the sysfs
> scrub control interface.

That sentence reads wrong.

> The scrub control attributes for a scrub instance are
> available to userspace in /sys/bus/edac/devices/<dev-name>/scrub*/.
>
> Generic EDAC scrub driver and the common sysfs scrub interface promotes
> unambiguous access from the userspace irrespective of the underlying scrub
> devices.

Huh?

Do you wanna say something along the lines that the common sysfs scrub
interface abstracts the control of an arbitrary scrubbing functionality into
a common set of functions or so?

> The sysfs scrub attribute nodes would be present only if the client driver
> has implemented the corresponding attribute callback function and pass in ops
> to the EDAC RAS feature driver during registration.
>
> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> Signed-off-by: Shiju Jose <shiju.jose@xxxxxxxxxx>
> ---
> Documentation/ABI/testing/sysfs-edac-scrub | 69 ++++
> drivers/edac/Makefile | 1 +
> drivers/edac/edac_device.c | 6 +-
> drivers/edac/edac_scrub.c | 377 +++++++++++++++++++++
> include/linux/edac.h | 30 ++
> 5 files changed, 482 insertions(+), 1 deletion(-)
> create mode 100644 Documentation/ABI/testing/sysfs-edac-scrub
> create mode 100755 drivers/edac/edac_scrub.c
>
> diff --git a/Documentation/ABI/testing/sysfs-edac-scrub b/Documentation/ABI/testing/sysfs-edac-scrub
> new file mode 100644
> index 000000000000..f465cc91423f
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-edac-scrub

...

> +What: /sys/bus/edac/devices/<dev-name>/scrub*/current_cycle_duration
> +Date: Oct 2024
> +KernelVersion: 6.12
> +Contact: linux-edac@xxxxxxxxxxxxxxx
> +Description:
> + (RW) The current scrub cycle duration in seconds and must be
> + within the supported range by the memory scrubber.

So in reading about that interface, where is the user doc explaining how one
should use scrubbers?

> diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
> index 4edfb83ffbee..fbf0e39ec678 100644
> --- a/drivers/edac/Makefile
> +++ b/drivers/edac/Makefile
> @@ -10,6 +10,7 @@ obj-$(CONFIG_EDAC) := edac_core.o
>
> edac_core-y := edac_mc.o edac_device.o edac_mc_sysfs.o
> edac_core-y += edac_module.o edac_device_sysfs.o wq.o
> +edac_core-y += edac_scrub.o

Just scrub.[co]. The file is already in drivers/edac/. Too many "edac"
strings. :)

>
> edac_core-$(CONFIG_EDAC_DEBUG) += debugfs.o
>
> diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
> index e4a5d010ea2d..6381896b6424 100644
> --- a/drivers/edac/edac_device.c
> +++ b/drivers/edac/edac_device.c
> @@ -608,12 +608,16 @@ static int edac_dev_feat_init(struct device *parent,
> const struct edac_dev_feature *ras_feat,
> const struct attribute_group **attr_groups)
> {
> - int num;
> + int num, ret;
>
> switch (ras_feat->ft_type) {
> case RAS_FEAT_SCRUB:
> dev_data->scrub_ops = ras_feat->scrub_ops;
> dev_data->private = ras_feat->ctx;
> + ret = edac_scrub_get_desc(parent, attr_groups,
> + ras_feat->instance);
> + if (ret)
> + return ret;
> return 1;
> case RAS_FEAT_ECS:
> num = ras_feat->ecs_info.num_media_frus;
> diff --git a/drivers/edac/edac_scrub.c b/drivers/edac/edac_scrub.c
> new file mode 100755
> index 000000000000..3f8f37629acf
> --- /dev/null
> +++ b/drivers/edac/edac_scrub.c
> @@ -0,0 +1,377 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Generic EDAC scrub driver supports controlling the memory
> + * scrubbers in the system and the common sysfs scrub interface
> + * promotes unambiguous access from the userspace.
> + *
> + * Copyright (c) 2024 HiSilicon Limited.
> + */
> +
> +#define pr_fmt(fmt) "EDAC SCRUB: " fmt
> +
> +#include <linux/edac.h>
> +
> +enum edac_scrub_attributes {
> + SCRUB_ADDR_RANGE_BASE,
> + SCRUB_ADDR_RANGE_SIZE,
> + SCRUB_ENABLE_BACKGROUND,
> + SCRUB_ENABLE_ON_DEMAND,
> + SCRUB_MIN_CYCLE_DURATION,
> + SCRUB_MAX_CYCLE_DURATION,
> + SCRUB_CURRENT_CYCLE_DURATION,
> + SCRUB_MAX_ATTRS
> +};
> +
> +struct edac_scrub_dev_attr {
> + struct device_attribute dev_attr;
> + u8 instance;
> +};
> +
> +struct edac_scrub_context {
> + char name[EDAC_FEAT_NAME_LEN];
> + struct edac_scrub_dev_attr scrub_dev_attr[SCRUB_MAX_ATTRS];
> + struct attribute *scrub_attrs[SCRUB_MAX_ATTRS + 1];
> + struct attribute_group group;
> +};
> +
> +#define to_scrub_dev_attr(_dev_attr) \
> + container_of(_dev_attr, struct edac_scrub_dev_attr, dev_attr)
> +
> +static ssize_t addr_range_base_show(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + char *buf)
> +{
> + u8 inst = ((struct edac_scrub_dev_attr *)to_scrub_dev_attr(attr))->instance;
> + struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_scrub_ops *ops = ctx->scrub[inst].scrub_ops;
> + u64 base, size;
> + int ret;
> +
> + ret = ops->read_range(ras_feat_dev->parent, ctx->scrub[inst].private, &base, &size);
> + if (ret)
> + return ret;
> +
> + return sysfs_emit(buf, "0x%llx\n", base);
> +}
> +
> +static ssize_t addr_range_size_show(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + char *buf)
> +{
> + u8 inst = ((struct edac_scrub_dev_attr *)to_scrub_dev_attr(attr))->instance;
> + struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_scrub_ops *ops = ctx->scrub[inst].scrub_ops;
> + u64 base, size;
> + int ret;
> +
> + ret = ops->read_range(ras_feat_dev->parent, ctx->scrub[inst].private, &base, &size);
> + if (ret)
> + return ret;
> +
> + return sysfs_emit(buf, "0x%llx\n", size);
> +}
> +
> +static ssize_t addr_range_base_store(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + const char *buf, size_t len)
> +{
> + u8 inst = ((struct edac_scrub_dev_attr *)to_scrub_dev_attr(attr))->instance;
> + struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_scrub_ops *ops = ctx->scrub[inst].scrub_ops;
> + u64 base, size;
> + int ret;
> +
> + ret = ops->read_range(ras_feat_dev->parent, ctx->scrub[inst].private, &base, &size);
> + if (ret)
> + return ret;

> +
> + ret = kstrtou64(buf, 0, &base);
> + if (ret < 0)
> + return ret;
> +
> + ret = ops->write_range(ras_feat_dev->parent, ctx->scrub[inst].private, base, size);
> + if (ret)
> + return ret;
> +
> + return len;
> +}
> +
> +static ssize_t addr_range_size_store(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + const char *buf,
> + size_t len)
> +{
> + u8 inst = ((struct edac_scrub_dev_attr *)to_scrub_dev_attr(attr))->instance;
> + struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_scrub_ops *ops = ctx->scrub[inst].scrub_ops;
> + u64 base, size;
> + int ret;
> +
> + ret = ops->read_range(ras_feat_dev->parent, ctx->scrub[inst].private, &base, &size);
> + if (ret)
> + return ret;
> +

Can all that repetitive code be abstracted away in macros pls?

Below too.
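
Roughly what I mean, as an untested userspace sketch: the structs are
simplified stand-ins for the kernel ones, and SCRUB_SHOW_U32() and the demo_*
names are made up for illustration, not something which exists in the tree.
One macro stamps out every "read a u32 and print it" handler:

```c
#include <stdio.h>
#include <stdint.h>

/* Simplified stand-in for struct edac_scrub_ops (assumption). */
struct scrub_ops {
	int (*min_cycle_read)(void *drv_data, uint32_t *val);
	int (*max_cycle_read)(void *drv_data, uint32_t *val);
};

/* Simplified stand-in for the per-instance context (assumption). */
struct scrub_ctx {
	const struct scrub_ops *ops;
	void *private;
};

/* Generates <attrib>_show(), which calls ops-><cb>() and formats the value. */
#define SCRUB_SHOW_U32(attrib, cb)					\
static int attrib##_show(struct scrub_ctx *ctx, char *buf, size_t len)	\
{									\
	uint32_t val;							\
	int ret;							\
									\
	ret = ctx->ops->cb(ctx->private, &val);				\
	if (ret)							\
		return ret;						\
									\
	return snprintf(buf, len, "%u\n", (unsigned int)val);		\
}

SCRUB_SHOW_U32(min_cycle_duration, min_cycle_read)
SCRUB_SHOW_U32(max_cycle_duration, max_cycle_read)

/* Hypothetical backend, only there to exercise the generated handlers. */
static int demo_min(void *drv_data, uint32_t *val)
{
	(void)drv_data;
	*val = 3600;
	return 0;
}

static int demo_max(void *drv_data, uint32_t *val)
{
	(void)drv_data;
	*val = 86400;
	return 0;
}

static const struct scrub_ops demo_ops = {
	.min_cycle_read = demo_min,
	.max_cycle_read = demo_max,
};
```

The store side and the bool attributes could presumably get the same
treatment with one macro per value type.
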

> + ret = kstrtou64(buf, 0, &size);
> + if (ret < 0)
> + return ret;
> +
> + ret = ops->write_range(ras_feat_dev->parent, ctx->scrub[inst].private, base, size);
> + if (ret)
> + return ret;
> +
> + return len;
> +}
> +
> +static ssize_t enable_background_store(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + const char *buf, size_t len)
> +{
> + u8 inst = ((struct edac_scrub_dev_attr *)to_scrub_dev_attr(attr))->instance;
> + struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_scrub_ops *ops = ctx->scrub[inst].scrub_ops;
> + bool enable;
> + int ret;
> +
> + ret = kstrtobool(buf, &enable);
> + if (ret < 0)
> + return ret;
> +
> + ret = ops->set_enabled_bg(ras_feat_dev->parent, ctx->scrub[inst].private, enable);
> + if (ret)
> + return ret;
> +
> + return len;
> +}
> +
> +static ssize_t enable_background_show(struct device *ras_feat_dev,
> + struct device_attribute *attr, char *buf)
> +{
> + u8 inst = ((struct edac_scrub_dev_attr *)to_scrub_dev_attr(attr))->instance;
> + struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_scrub_ops *ops = ctx->scrub[inst].scrub_ops;
> + bool enable;
> + int ret;
> +
> + ret = ops->get_enabled_bg(ras_feat_dev->parent, ctx->scrub[inst].private, &enable);
> + if (ret)
> + return ret;
> +
> + return sysfs_emit(buf, "%d\n", enable);
> +}
> +
> +static ssize_t enable_on_demand_show(struct device *ras_feat_dev,
> + struct device_attribute *attr, char *buf)
> +{
> + u8 inst = ((struct edac_scrub_dev_attr *)to_scrub_dev_attr(attr))->instance;
> + struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_scrub_ops *ops = ctx->scrub[inst].scrub_ops;
> + bool enable;
> + int ret;
> +
> + ret = ops->get_enabled_od(ras_feat_dev->parent, ctx->scrub[inst].private, &enable);
> + if (ret)
> + return ret;
> +
> + return sysfs_emit(buf, "%d\n", enable);
> +}
> +
> +static ssize_t enable_on_demand_store(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + const char *buf, size_t len)
> +{
> + u8 inst = ((struct edac_scrub_dev_attr *)to_scrub_dev_attr(attr))->instance;
> + struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_scrub_ops *ops = ctx->scrub[inst].scrub_ops;
> + bool enable;
> + int ret;
> +
> + ret = kstrtobool(buf, &enable);
> + if (ret < 0)
> + return ret;
> +
> + ret = ops->set_enabled_od(ras_feat_dev->parent, ctx->scrub[inst].private, enable);
> + if (ret)
> + return ret;
> +
> + return len;
> +}
> +
> +static ssize_t min_cycle_duration_show(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + char *buf)
> +{
> + u8 inst = ((struct edac_scrub_dev_attr *)to_scrub_dev_attr(attr))->instance;
> + struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_scrub_ops *ops = ctx->scrub[inst].scrub_ops;
> + u32 val;
> + int ret;
> +
> + ret = ops->min_cycle_read(ras_feat_dev->parent, ctx->scrub[inst].private, &val);
> + if (ret)
> + return ret;
> +
> + return sysfs_emit(buf, "%u\n", val);
> +}
> +
> +static ssize_t max_cycle_duration_show(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + char *buf)
> +{
> + u8 inst = ((struct edac_scrub_dev_attr *)to_scrub_dev_attr(attr))->instance;
> + struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_scrub_ops *ops = ctx->scrub[inst].scrub_ops;
> + u32 val;
> + int ret;
> +
> + ret = ops->max_cycle_read(ras_feat_dev->parent, ctx->scrub[inst].private, &val);
> + if (ret)
> + return ret;
> +
> + return sysfs_emit(buf, "%u\n", val);
> +}
> +
> +static ssize_t current_cycle_duration_show(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + char *buf)
> +{
> + u8 inst = ((struct edac_scrub_dev_attr *)to_scrub_dev_attr(attr))->instance;
> + struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_scrub_ops *ops = ctx->scrub[inst].scrub_ops;
> + u32 val;
> + int ret;
> +
> + ret = ops->cycle_duration_read(ras_feat_dev->parent, ctx->scrub[inst].private, &val);
> + if (ret)
> + return ret;
> +
> + return sysfs_emit(buf, "%u\n", val);
> +}
> +
> +static ssize_t current_cycle_duration_store(struct device *ras_feat_dev,
> + struct device_attribute *attr,
> + const char *buf, size_t len)
> +{
> + u8 inst = ((struct edac_scrub_dev_attr *)to_scrub_dev_attr(attr))->instance;
> + struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_scrub_ops *ops = ctx->scrub[inst].scrub_ops;
> + long val;
> + int ret;
> +
> + ret = kstrtol(buf, 0, &val);
> + if (ret < 0)
> + return ret;
> +
> + ret = ops->cycle_duration_write(ras_feat_dev->parent, ctx->scrub[inst].private, val);
> + if (ret)
> + return ret;
> +
> + return len;
> +}
> +
> +static umode_t scrub_attr_visible(struct kobject *kobj,
> + struct attribute *a, int attr_id)
> +{
> + struct device *ras_feat_dev = kobj_to_dev(kobj);
> + struct device_attribute *dev_attr =
> + container_of(a, struct device_attribute, attr);

No silly linebreaks like that pls. Check your whole patchset.

> + u8 inst = ((struct edac_scrub_dev_attr *)to_scrub_dev_attr(dev_attr))->instance;
> + struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev);
> + const struct edac_scrub_ops *ops = ctx->scrub[inst].scrub_ops;
> +
> + switch (attr_id) {
> + case SCRUB_ADDR_RANGE_BASE:
> + case SCRUB_ADDR_RANGE_SIZE:
> + if (ops->read_range && ops->write_range)
> + return a->mode;
> + if (ops->read_range)
> + return 0444;

	if (...read_range) {
		if (...write_range)
			return a->mode;
		else
			return 0444;
	}
	break;

and now put a single "return 0;" at the end of the function.

Below too.
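
Spelled out for the whole switch, as a rough userspace sketch. The ops struct
is a simplified stand-in with the same member names as in the patch, while
scrub_attr_mode() and demo_cb() are made-up names for illustration only:

```c
enum edac_scrub_attributes {
	SCRUB_ADDR_RANGE_BASE,
	SCRUB_ADDR_RANGE_SIZE,
	SCRUB_ENABLE_BACKGROUND,
	SCRUB_ENABLE_ON_DEMAND,
	SCRUB_MIN_CYCLE_DURATION,
	SCRUB_MAX_CYCLE_DURATION,
	SCRUB_CURRENT_CYCLE_DURATION,
	SCRUB_MAX_ATTRS
};

/* Stand-in ops: only whether a callback is set matters here (assumption). */
struct scrub_ops {
	int (*read_range)(void);
	int (*write_range)(void);
	int (*get_enabled_bg)(void);
	int (*set_enabled_bg)(void);
	int (*get_enabled_od)(void);
	int (*set_enabled_od)(void);
	int (*min_cycle_read)(void);
	int (*max_cycle_read)(void);
	int (*cycle_duration_read)(void);
	int (*cycle_duration_write)(void);
};

/* Hypothetical callback used to mark an op as implemented. */
static int demo_cb(void)
{
	return 0;
}

/* Every case either returns a mode or breaks out to the single "return 0". */
static unsigned short scrub_attr_mode(const struct scrub_ops *ops,
				      int attr_id, unsigned short mode)
{
	switch (attr_id) {
	case SCRUB_ADDR_RANGE_BASE:
	case SCRUB_ADDR_RANGE_SIZE:
		if (ops->read_range) {
			if (ops->write_range)
				return mode;
			else
				return 0444;
		}
		break;
	case SCRUB_ENABLE_BACKGROUND:
		if (ops->get_enabled_bg) {
			if (ops->set_enabled_bg)
				return mode;
			else
				return 0444;
		}
		break;
	case SCRUB_ENABLE_ON_DEMAND:
		if (ops->get_enabled_od) {
			if (ops->set_enabled_od)
				return mode;
			else
				return 0444;
		}
		break;
	case SCRUB_MIN_CYCLE_DURATION:
		if (ops->min_cycle_read)
			return mode;
		break;
	case SCRUB_MAX_CYCLE_DURATION:
		if (ops->max_cycle_read)
			return mode;
		break;
	case SCRUB_CURRENT_CYCLE_DURATION:
		if (ops->cycle_duration_read) {
			if (ops->cycle_duration_write)
				return mode;
			else
				return 0444;
		}
		break;
	default:
		break;
	}

	return 0;
}
```
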

> + return 0;
> + case SCRUB_ENABLE_BACKGROUND:
> + if (ops->get_enabled_bg && ops->set_enabled_bg)
> + return a->mode;
> + if (ops->get_enabled_bg)
> + return 0444;
> + return 0;
> + case SCRUB_ENABLE_ON_DEMAND:
> + if (ops->get_enabled_od && ops->set_enabled_od)
> + return a->mode;
> + if (ops->get_enabled_od)
> + return 0444;
> + return 0;
> + case SCRUB_MIN_CYCLE_DURATION:
> + return ops->min_cycle_read ? a->mode : 0;

	if (ops->min_cycle_read)
		return a->mode;

> + case SCRUB_MAX_CYCLE_DURATION:
> + return ops->max_cycle_read ? a->mode : 0;
> + case SCRUB_CURRENT_CYCLE_DURATION:
> + if (ops->cycle_duration_read && ops->cycle_duration_write)
> + return a->mode;
> + if (ops->cycle_duration_read)
> + return 0444;
> + return 0;
> + default:
> + return 0;
> + }
> +}
> +
> +#define EDAC_SCRUB_ATTR_RO(_name, _instance) \
> + ((struct edac_scrub_dev_attr) { .dev_attr = __ATTR_RO(_name), \
> + .instance = _instance })
> +
> +#define EDAC_SCRUB_ATTR_WO(_name, _instance) \
> + ((struct edac_scrub_dev_attr) { .dev_attr = __ATTR_WO(_name), \
> + .instance = _instance })
> +
> +#define EDAC_SCRUB_ATTR_RW(_name, _instance) \
> + ((struct edac_scrub_dev_attr) { .dev_attr = __ATTR_RW(_name), \
> + .instance = _instance })
> +
> +static int scrub_create_desc(struct device *scrub_dev,
> + const struct attribute_group **attr_groups,
> + u8 instance)
> +{
> + struct edac_scrub_context *scrub_ctx;
> + struct attribute_group *group;
> + int i;
> +
> + scrub_ctx = devm_kzalloc(scrub_dev, sizeof(*scrub_ctx), GFP_KERNEL);
> + if (!scrub_ctx)
> + return -ENOMEM;
> +
> + group = &scrub_ctx->group;
> + scrub_ctx->scrub_dev_attr[0] = EDAC_SCRUB_ATTR_RW(addr_range_base, instance);
> + scrub_ctx->scrub_dev_attr[1] = EDAC_SCRUB_ATTR_RW(addr_range_size, instance);
> + scrub_ctx->scrub_dev_attr[2] = EDAC_SCRUB_ATTR_RW(enable_background, instance);
> + scrub_ctx->scrub_dev_attr[3] = EDAC_SCRUB_ATTR_RW(enable_on_demand, instance);
> + scrub_ctx->scrub_dev_attr[4] = EDAC_SCRUB_ATTR_RO(min_cycle_duration, instance);
> + scrub_ctx->scrub_dev_attr[5] = EDAC_SCRUB_ATTR_RO(max_cycle_duration, instance);
> + scrub_ctx->scrub_dev_attr[6] = EDAC_SCRUB_ATTR_RW(current_cycle_duration, instance);

Why use the naked numbers when you have enum edac_scrub_attributes?
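
IOW, index the array with the enumerators so that the slot and the attribute
cannot drift apart. A small userspace sketch of that, where struct
demo_scrub_attr, the DEMO_ATTR_*() macros and demo_fill_attrs() are made-up
stand-ins for the kernel types and helpers:

```c
enum edac_scrub_attributes {
	SCRUB_ADDR_RANGE_BASE,
	SCRUB_ADDR_RANGE_SIZE,
	SCRUB_ENABLE_BACKGROUND,
	SCRUB_ENABLE_ON_DEMAND,
	SCRUB_MIN_CYCLE_DURATION,
	SCRUB_MAX_CYCLE_DURATION,
	SCRUB_CURRENT_CYCLE_DURATION,
	SCRUB_MAX_ATTRS
};

/* Simplified stand-in for struct edac_scrub_dev_attr (assumption). */
struct demo_scrub_attr {
	const char *name;
	unsigned short mode;
	unsigned char instance;
};

#define DEMO_ATTR_RW(_name, _inst) \
	((struct demo_scrub_attr){ #_name, 0644, _inst })
#define DEMO_ATTR_RO(_name, _inst) \
	((struct demo_scrub_attr){ #_name, 0444, _inst })

/* Each slot is named by its enumerator instead of a naked number. */
static void demo_fill_attrs(struct demo_scrub_attr *attrs,
			    unsigned char instance)
{
	attrs[SCRUB_ADDR_RANGE_BASE] = DEMO_ATTR_RW(addr_range_base, instance);
	attrs[SCRUB_ADDR_RANGE_SIZE] = DEMO_ATTR_RW(addr_range_size, instance);
	attrs[SCRUB_ENABLE_BACKGROUND] = DEMO_ATTR_RW(enable_background, instance);
	attrs[SCRUB_ENABLE_ON_DEMAND] = DEMO_ATTR_RW(enable_on_demand, instance);
	attrs[SCRUB_MIN_CYCLE_DURATION] = DEMO_ATTR_RO(min_cycle_duration, instance);
	attrs[SCRUB_MAX_CYCLE_DURATION] = DEMO_ATTR_RO(max_cycle_duration, instance);
	attrs[SCRUB_CURRENT_CYCLE_DURATION] = DEMO_ATTR_RW(current_cycle_duration, instance);
}
```

That matters here because scrub_attr_visible() switches on the same
enumerators via attr_id, so the array order and the enum order have to agree.
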

> + for (i = 0; i < SCRUB_MAX_ATTRS; i++)
> + scrub_ctx->scrub_attrs[i] = &scrub_ctx->scrub_dev_attr[i].dev_attr.attr;
> +
> + sprintf(scrub_ctx->name, "%s%d", "scrub", instance);
> + group->name = scrub_ctx->name;
> + group->attrs = scrub_ctx->scrub_attrs;
> + group->is_visible = scrub_attr_visible;
> +
> + attr_groups[0] = group;
> +
> + return 0;
> +}
> +
> +/**
> + * edac_scrub_get_desc - get EDAC scrub descriptors
> + * @scrub_dev: client device, with scrub support
> + * @attr_groups: pointer to attrribute group container

+ * @attr_groups: pointer to attrribute group container
Unknown word [attrribute] in comment.
Suggestions: ['attribute', 'attributed', 'attributes', "attribute's", 'attributive', 'tribute']

Please introduce a spellchecker into your patch creation workflow.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette