RE: [PATCH 3/3] cxl/core: Add sysfs attribute get_poison for list retrieval
From: Dan Williams
Date: Fri Jun 17 2022 - 14:42:20 EST
alison.schofield@ wrote:
> From: Alison Schofield <alison.schofield@xxxxxxxxx>
>
> The sysfs attribute, get_poison, allows user space to request the
> retrieval of a CXL devices poison list for its persistent memory.
If the device also supports get poison list for volatile memory, just
grab that too. With the "to be released soon" region patches, userspace
can trivially translate DPAs to media type.
>
> From Documentation/ABI/.../sysfs-bus-cxl
> (WO) When a '1' is written to this attribute the memdev
> driver retrieves the poison list from the device. The list
> includes addresses that are poisoned or would result in
> poison if accessed, and the source of the poison. This
> attribute is only visible for devices supporting the
> capability. The retrieved errors are logged as kernel
> trace events with the label: cxl_poison_list.
>
> Signed-off-by: Alison Schofield <alison.schofield@xxxxxxxxx>
> ---
> Documentation/ABI/testing/sysfs-bus-cxl | 13 ++++++++++
> drivers/cxl/core/memdev.c | 32 +++++++++++++++++++++++++
> 2 files changed, 45 insertions(+)
>
> diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
> index 7c2b846521f3..9d0c3988fdd2 100644
> --- a/Documentation/ABI/testing/sysfs-bus-cxl
> +++ b/Documentation/ABI/testing/sysfs-bus-cxl
> @@ -163,3 +163,16 @@ Description:
> memory (type-3). The 'target_type' attribute indicates the
> current setting which may dynamically change based on what
> memory regions are activated in this decode hierarchy.
> +
> +What: /sys/bus/cxl/devices/memX/get_poison
> +Date: June, 2022
> +KernelVersion: v5.20
> +Contact: linux-cxl@xxxxxxxxxxxxxxx
> +Description:
> + (WO) When a '1' is written to this attribute the memdev
> + driver retrieves the poison list from the device. The list
> + includes addresses that are poisoned or would result in
> + poison if accessed, and the source of the poison. This
> + attribute is only visible for devices supporting the
> + capability. The retrieved errors are logged as kernel
> + trace events with the label: cxl_poison_list.
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index f7cdcd33504a..5ef9ffaa934a 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -106,12 +106,34 @@ static ssize_t numa_node_show(struct device *dev, struct device_attribute *attr,
> }
> static DEVICE_ATTR_RO(numa_node);
>
> +static ssize_t get_poison_store(struct device *dev,
> + struct device_attribute *attr,
> + const char *buf, size_t len)
> +
> +{
> + int rc;
> +
> + if (!sysfs_streq(buf, "1")) {
Use kstrtobool() here?
> + dev_err(dev, "%s: unknown value: %s\n", attr->attr.name, buf);
dev_err() is overkill for sysfs errors. dev_dbg() can be nice for errors
that trigger deep within the kernel in response to a sysfs write, but in
this case returning -EINVAL is sufficient.
> + return -EINVAL;
> + }
> +
> + rc = cxl_mem_get_poison_list(dev);
> + if (rc) {
> + dev_err(dev, "Failed to retrieve poison list %d\n", rc);
Too chatty; make this dev_dbg() or delete it.
> + return rc;
> + }
> + return len;
> +}
> +static DEVICE_ATTR_WO(get_poison);
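To make the comments on get_poison_store() above concrete, here is a
rough userspace model of the shape I have in mind. It is a sketch only,
not kernel code: model_kstrtobool() is a stand-in that covers just a
subset of the spellings kstrtobool() accepts, and model_get_poison_list()
is a hypothetical stub for cxl_mem_get_poison_list(). Both names are
invented for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Userspace stand-in for kstrtobool(). Like the kernel helper it keys
 * off the leading character, so "1\n" from echo parses fine. Only the
 * '1'/'y'/'Y' and '0'/'n'/'N' spellings are modelled here.
 */
int model_kstrtobool(const char *s, bool *res)
{
	if (!s)
		return -EINVAL;

	switch (s[0]) {
	case '1': case 'y': case 'Y':
		*res = true;
		return 0;
	case '0': case 'n': case 'N':
		*res = false;
		return 0;
	default:
		return -EINVAL;
	}
}

/* Hypothetical stub standing in for cxl_mem_get_poison_list(dev). */
int model_get_poison_list(void)
{
	return 0;
}

/* Shape of the store handler with both dev_err() calls dropped. */
ssize_t model_get_poison_store(const char *buf, size_t len)
{
	bool trigger;
	int rc;

	/* Unparseable input and an explicit "0" are both rejected quietly. */
	if (model_kstrtobool(buf, &trigger) || !trigger)
		return -EINVAL;

	rc = model_get_poison_list();
	if (rc)
		return rc;

	return len;
}
```

In the driver that becomes a kstrtobool() call followed by a bare
-EINVAL return, with no message on either error path.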
> +
> static struct attribute *cxl_memdev_attributes[] = {
> &dev_attr_serial.attr,
> &dev_attr_firmware_version.attr,
> &dev_attr_payload_max.attr,
> &dev_attr_label_storage_size.attr,
> &dev_attr_numa_node.attr,
> + &dev_attr_get_poison.attr,
> NULL,
> };
>
> @@ -130,6 +152,16 @@ static umode_t cxl_memdev_visible(struct kobject *kobj, struct attribute *a,
> {
> if (!IS_ENABLED(CONFIG_NUMA) && a == &dev_attr_numa_node.attr)
> return 0;
> +
> + if (a == &dev_attr_get_poison.attr) {
> + struct device *dev = container_of(kobj, struct device, kobj);
Use the kobj_to_dev() helper.
> + struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> + struct cxl_dev_state *cxlds = cxlmd->cxlds;
> +
> + if (!test_bit(CXL_MEM_COMMAND_ID_GET_POISON,
> + cxlds->enabled_cmds))
> + return 0;
> + }
> return a->mode;
> }
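For reference, kobj_to_dev() is just a wrapper around that same
container_of(), so the conversion is purely cosmetic. A rough userspace
illustration follows, with simplified stand-in struct definitions and a
model_ prefix (my naming, not the kernel's) to make clear this is not the
real header:

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types; only the embedding matters. */
struct kobject {
	const char *name;
};

struct device {
	int id;
	struct kobject kobj;
};

/* Same pointer arithmetic as the kernel's container_of(), minus type checks. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Model of kobj_to_dev(): recover the device that embeds this kobject. */
struct device *model_kobj_to_dev(struct kobject *kobj)
{
	return container_of(kobj, struct device, kobj);
}
```

With the helper, the first declaration in the get_poison branch reads as
a plain kobj_to_dev(kobj) call and the rest of the block is unchanged.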
>
> --
> 2.31.1
>