Re: [PATCH v13 10/18] cxl/memfeature: Add CXL memory device patrol scrub control feature

From: Jonathan Cameron
Date: Mon Oct 14 2024 - 11:32:57 EST


On Wed, 9 Oct 2024 13:41:11 +0100
<shiju.jose@xxxxxxxxxx> wrote:

> From: Shiju Jose <shiju.jose@xxxxxxxxxx>
>
> CXL spec 3.1 section 8.2.9.9.11.1 describes the device patrol scrub control
> feature. The device patrol scrub proactively locates and makes corrections
> to errors in regular cycle.
>
> Allow specifying the number of hours within which the patrol scrub must be
> completed, subject to minimum and maximum limits reported by the device.
> Also allow disabling scrub allowing trade-off error rates against
> performance.
>
> Add support for CXL memory device based patrol scrub control.
> Register with EDAC device driver , which gets the scrub attr descriptors
> from EDAC scrub and exposes sysfs scrub control attributes to the
> userspace. For example CXL device based scrub control for the CXL mem0
> device is exposed in /sys/bus/edac/devices/cxl_mem0/scrubX/
>
> Also add support for region based CXL memory patrol scrub control.
> CXL memory region may be interleaved across one or more CXL memory devices.
> For example region based scrub control for CXL region1 is exposed in
> /sys/bus/edac/devices/cxl_region1/scrubX/
>
> Open Questions:
> Q1: CXL 3.1 spec defined patrol scrub control feature at CXL memory devices
> with supporting set scrub cycle and enable/disable scrub. but not based on
> HPA range. Thus presently scrub control for a region is implemented based
> on all associated CXL memory devices.

That is exactly what I'd expect.

> What is the exact use case for the CXL region based scrub control?
> How the HPA range, which Dan asked for region based scrubbing is used?
> Does spec change is required for patrol scrub control feature with support
> for setting the HPA range?

Can't discuss future spec here :( In any case we should support the
current specification even if it is changing (can't say if it is!)

This came up briefly at LPC. The HPA range is only useful as a userspace
shortcut to find the right control, so it isn't necessary initially for
the reason you state: we can't control scrubbing by HPA range.

Whilst we may scrub by region, it's just a way to control scrubbing of
a set of interleaved devices. So what you have here is fine as it
stands.

>
> Q2: Both CXL device based and CXL region based scrub control would be
> enabled at the same time in a system?

Typically no, but we should make the interface do something consistent.
Two options:

1) Go with the highest scrub frequency requested via either path.
2) Go with the most recently requested scrub frequency.

Given it is a corner case I don't think we care which.
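
To make the two options concrete, here is a rough standalone sketch
(untested against the patch; every name in it is made up for
illustration and nothing here is an existing kernel interface). It
assumes we remember the last cycle requested via each path, with 0
meaning "no request yet", and that a shorter cycle in hours means a
higher scrub frequency.

```c
#include <stdint.h>

/* Hypothetical policy selector, not part of the patch. */
enum scrub_policy {
	SCRUB_POLICY_MOST_FREQUENT,	/* option 1: highest frequency wins */
	SCRUB_POLICY_LATEST,		/* option 2: most recent request wins */
};

/* Hypothetical bookkeeping of what each path last asked for. */
struct scrub_requests {
	uint8_t dev_hrs;	/* cycle requested via the device path, 0 if none */
	uint8_t region_hrs;	/* cycle requested via the region path, 0 if none */
	int region_was_last;	/* non-zero if the region request is the newest */
};

/* Pick the scrub cycle (in hours) to program into the device. */
static inline uint8_t scrub_resolve_cycle_hrs(const struct scrub_requests *r,
					      enum scrub_policy policy)
{
	/* Only one path has asked for anything: nothing to reconcile. */
	if (!r->dev_hrs)
		return r->region_hrs;
	if (!r->region_hrs)
		return r->dev_hrs;

	if (policy == SCRUB_POLICY_LATEST)
		return r->region_was_last ? r->region_hrs : r->dev_hrs;

	/* Highest frequency is the shortest cycle. */
	return r->dev_hrs < r->region_hrs ? r->dev_hrs : r->region_hrs;
}
```

With a 24 hour device request and a later-superseded 12 hour region
request, option 1 programs 12 hours and option 2 programs whichever was
written last.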

Device based scrub is appropriate for 'pre use' scrub control, to find
out if we have dodgy hardware. Region scrub is the logical thing to do
once the memory is in use. In some cases the region will cover the whole
of every device in an interleave set.

So I don't see either of these questions as a blocker for the current
implementation.

>
> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> Signed-off-by: Shiju Jose <shiju.jose@xxxxxxxxxx>

A few trivial things inline.

> ---
> Documentation/edac/edac-scrub.rst | 74 ++++++
> drivers/cxl/Kconfig | 18 ++
> drivers/cxl/core/Makefile | 1 +
> drivers/cxl/core/memfeature.c | 383 ++++++++++++++++++++++++++++++
> drivers/cxl/core/region.c | 6 +
> drivers/cxl/cxlmem.h | 7 +
> drivers/cxl/mem.c | 4 +
> 7 files changed, 493 insertions(+)
> create mode 100644 Documentation/edac/edac-scrub.rst
> create mode 100644 drivers/cxl/core/memfeature.c
>
> diff --git a/Documentation/edac/edac-scrub.rst b/Documentation/edac/edac-scrub.rst
> new file mode 100644
> index 000000000000..243035957e99
> --- /dev/null
> +++ b/Documentation/edac/edac-scrub.rst
> @@ -0,0 +1,74 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +===================
> +EDAC Scrub control
> +===================
> +
> +Copyright (c) 2024 HiSilicon Limited.
> +
> +:Author: Shiju Jose <shiju.jose@xxxxxxxxxx>
> +:License: The GNU Free Documentation License, Version 1.2
> + (dual licensed under the GPL v2)
> +:Original Reviewers:
> +
> +- Written for: 6.12

Update to 6.13

> +- Updated for:

> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index 99b5c25be079..b717a152d2a5 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -145,4 +145,22 @@ config CXL_REGION_INVALIDATION_TEST
> If unsure, or if this kernel is meant for production environments,
> say N.
>
> +config CXL_RAS_FEAT
> + tristate "CXL: Memory RAS features"
> + depends on CXL_PCI
> + depends on CXL_MEM
> + depends on EDAC
> + help
> + The CXL memory RAS feature control is optional allows host to control
> + the RAS features configurations of CXL Type 3 devices.
> +
> + Registers with the EDAC device subsystem to expose control attributes
> + of CXL memory device's RAS features to the user.
> + Provides interface functions to support configuring the CXL memory
> + device's RAS features.
> +
> + Say 'y/n' to enable/disable CXL.mem device'ss RAS features control.

device's or devices' (singular or plural possessive), but not device'ss

> + See section 8.2.9.9.11 of CXL 3.1 specification for the detailed
> + information of CXL memory device features.
> +
> endif

> diff --git a/drivers/cxl/core/memfeature.c b/drivers/cxl/core/memfeature.c
> new file mode 100644
> index 000000000000..84d6e887a4fa
> --- /dev/null
> +++ b/drivers/cxl/core/memfeature.c
> @@ -0,0 +1,383 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * CXL memory RAS feature driver.
> + *
> + * Copyright (c) 2024 HiSilicon Limited.
> + *
> + * - Supports functions to configure RAS features of the
> + * CXL memory devices.
> + * - Registers with the EDAC device subsystem driver to expose
> + * the features sysfs attributes to the user for configuring
> + * CXL memory RAS feature.
> + */
> +
> +#define pr_fmt(fmt) "CXL MEM FEAT: " fmt
> +
> +#include <cxlmem.h>
> +#include <linux/cleanup.h>
> +#include <linux/limits.h>
> +#include <cxl.h>

Reorder includes to put the cxl ones at the end and others
in alphabetical order.

> +#include <linux/edac.h>
>
> +static int cxl_ps_get_attrs(struct device *dev, void *drv_data,
> + struct cxl_memdev_ps_params *params)
> +{
> + struct cxl_patrol_scrub_context *cxl_ps_ctx = drv_data;
> + struct cxl_memdev *cxlmd;
> + struct cxl_dev_state *cxlds;
> + struct cxl_memdev_state *mds;
> + u16 min_scrub_cycle = 0;
> + int i, ret;
> +
> + if (cxl_ps_ctx->cxlr) {
> + struct cxl_region *cxlr = cxl_ps_ctx->cxlr;
> + struct cxl_region_params *p = &cxlr->params;
> +
> + for (i = p->interleave_ways - 1; i >= 0; i--) {
> + struct cxl_endpoint_decoder *cxled = p->targets[i];
> +
> + cxlmd = cxled_to_memdev(cxled);
> + cxlds = cxlmd->cxlds;
> + mds = to_cxl_memdev_state(cxlds);
> + ret = cxl_mem_ps_get_attrs(mds, params);
> + if (ret)
> + return ret;
> +
> + if (params->min_scrub_cycle_hrs > min_scrub_cycle)
> + min_scrub_cycle = params->min_scrub_cycle_hrs;
> + }
> + params->min_scrub_cycle_hrs = min_scrub_cycle;
> + return 0;
> + }
> + cxlmd = cxl_ps_ctx->cxlmd;
> + cxlds = cxlmd->cxlds;
> + mds = to_cxl_memdev_state(cxlds);
> +

This is the similar case I refer to in my comment on cxl_ps_set_attrs()
below.

> + return cxl_mem_ps_get_attrs(mds, params);
> +}
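
For anyone reading along: the region path above reports the largest of
the per-device minimums, which is the shortest cycle that every device
in the interleave set supports. A standalone sketch of just that
aggregation (the array stands in for the per-device
cxl_mem_ps_get_attrs() calls, and the function name is mine, not from
the patch):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: dev_min_hrs[] holds the minimum scrub cycle, in
 * hours, reported by each device in the interleave set.
 */
static inline uint16_t region_min_scrub_cycle_hrs(const uint16_t *dev_min_hrs,
						   size_t n)
{
	uint16_t min_cycle = 0;
	size_t i;

	/* The region minimum is the largest device minimum. */
	for (i = 0; i < n; i++)
		if (dev_min_hrs[i] > min_cycle)
			min_cycle = dev_min_hrs[i];

	return min_cycle;
}
```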

> +
> +static int cxl_ps_set_attrs(struct device *dev, void *drv_data,
> + struct cxl_memdev_ps_params *params,
> + enum cxl_scrub_param param_type)
> +{
> + struct cxl_patrol_scrub_context *cxl_ps_ctx = drv_data;
> + struct cxl_memdev *cxlmd;
> + struct cxl_dev_state *cxlds;
> + struct cxl_memdev_state *mds;
> + int ret, i;
> +
> + if (cxl_ps_ctx->cxlr) {
> + struct cxl_region *cxlr = cxl_ps_ctx->cxlr;
> + struct cxl_region_params *p = &cxlr->params;
> +
> + for (i = p->interleave_ways - 1; i >= 0; i--) {
> + struct cxl_endpoint_decoder *cxled = p->targets[i];
> +
> + cxlmd = cxled_to_memdev(cxled);
> + cxlds = cxlmd->cxlds;
> + mds = to_cxl_memdev_state(cxlds);
> + ret = cxl_mem_ps_set_attrs(dev, drv_data, mds,
> + params, param_type);
> + if (ret)
> + return ret;
> + }

Maybe return 0 here?

> + } else {
> + cxlmd = cxl_ps_ctx->cxlmd;
> + cxlds = cxlmd->cxlds;
> + mds = to_cxl_memdev_state(cxlds);
> +
> + return cxl_mem_ps_set_attrs(dev, drv_data, mds, params, param_type);

Then the else can go and the indent of this hunk can drop, similar to
how cxl_ps_get_attrs() above handles the region and device cases.

> + }
> +
> + return 0;
> +}
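
Roughly the shape I have in mind for the control flow, as a standalone
sketch. The fake_* types and helpers are stand-ins so it builds outside
the kernel; they are not the real CXL structures, and the single
cycle_hrs argument is a simplification of params/param_type.

```c
#include <stddef.h>

/* Stand-in for a memdev; 'fail' lets a caller simulate a device error. */
struct fake_memdev {
	int cycle_hrs;
	int fail;
};

/* Stand-in for struct cxl_patrol_scrub_context. */
struct fake_ps_ctx {
	struct fake_memdev **targets;	/* non-NULL for region based control */
	int interleave_ways;
	struct fake_memdev *cxlmd;	/* used for device based control */
};

/* Stand-in for cxl_mem_ps_set_attrs(). */
static inline int fake_mem_ps_set(struct fake_memdev *md, int cycle_hrs)
{
	if (md->fail)
		return -1;
	md->cycle_hrs = cycle_hrs;
	return 0;
}

static inline int fake_ps_set_attrs(struct fake_ps_ctx *ctx, int cycle_hrs)
{
	int i, ret;

	if (ctx->targets) {
		for (i = ctx->interleave_ways - 1; i >= 0; i--) {
			ret = fake_mem_ps_set(ctx->targets[i], cycle_hrs);
			if (ret)
				return ret;
		}
		return 0;	/* region path is done, so no else is needed */
	}

	/* Device path, one indent level shallower than before. */
	return fake_mem_ps_set(ctx->cxlmd, cycle_hrs);
}
```

Behaviour is unchanged from the posted version, including stopping at
the first device that returns an error.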