Re: [PATCH v3 2/4] edac: Add support for Amazon's Annapurna Labs L1 EDAC

From: Robert Richter
Date: Tue Sep 03 2019 - 03:25:13 EST


On 15.07.19 16:24:07, Hanna Hawa wrote:
> Adds support for Amazon's Annapurna Labs L1 EDAC driver to detect and
> report L1 errors.
>
> Signed-off-by: Hanna Hawa <hhhawa@xxxxxxxxxx>
> Reviewed-by: James Morse <james.morse@xxxxxxx>
> ---
> MAINTAINERS | 6 ++
> drivers/edac/Kconfig | 8 +++
> drivers/edac/Makefile | 1 +
> drivers/edac/al_l1_edac.c | 156 ++++++++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 171 insertions(+)
> create mode 100644 drivers/edac/al_l1_edac.c

> diff --git a/drivers/edac/al_l1_edac.c b/drivers/edac/al_l1_edac.c
> new file mode 100644
> index 0000000..70510ea
> --- /dev/null
> +++ b/drivers/edac/al_l1_edac.c

[...]

> +static void al_l1_edac_cpumerrsr(void *arg)

Could this be renamed to something more meaningful, such as
*_read_status() or so?

> +{
> + struct edac_device_ctl_info *edac_dev = arg;
> + int cpu, i;
> + u32 ramid, repeat, other, fatal;
> + u64 val = read_sysreg_s(ARM_CA57_CPUMERRSR_EL1);
> + char msg[AL_L1_EDAC_MSG_MAX];
> + int space, count;
> + char *p;
> +
> + if (!(FIELD_GET(ARM_CA57_CPUMERRSR_VALID, val)))
> + return;

[...]

> +static void al_l1_edac_check(struct edac_device_ctl_info *edac_dev)
> +{
> + on_each_cpu(al_l1_edac_cpumerrsr, edac_dev, 1);
> +}
> +
> +static int al_l1_edac_probe(struct platform_device *pdev)
> +{
> + struct edac_device_ctl_info *edac_dev;
> + struct device *dev = &pdev->dev;
> + int ret;
> +
> + edac_dev = edac_device_alloc_ctl_info(0, (char *)dev_name(dev), 1, "L",

This type cast looks broken. dev_name() is a constant string already.

Other drivers do not use the dynamically generated dev_name() string
here; instead a fixed string such as mod_name or ctl_name could be used.
edac_device_alloc_ctl_info() later generates a unique instance name
derived from name + index.

Regarding the type, this seems to be an API issue of
edac_device_alloc_ctl_info(), which should actually use const char * in
its interface. So if needed (from what I wrote above it is not), the
type in the argument list needs to be changed instead.
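
To illustrate the point about the name argument, here is a rough
userspace sketch, not the EDAC code itself. make_instance_name() and its
signature are hypothetical; it only mimics the "name + index" scheme
mentioned above and takes const char * so that callers need no cast.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of the EDAC API): build an instance name
 * from a fixed base name and an index. Because the parameter is
 * const char *, string literals and dev_name()-style constants can be
 * passed without casting away const.
 */
static int make_instance_name(char *buf, size_t len,
			      const char *name, int index)
{
	int n = snprintf(buf, len, "%s%d", name, index);

	/* Treat output errors and truncation as failure. */
	if (n < 0 || (size_t)n >= len)
		return -1;

	return 0;
}
```

With a fixed base name, every instance still ends up with a distinct
name because the index differs.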

> + 1, 1, NULL, 0,
> + edac_device_alloc_index());
> + if (IS_ERR(edac_dev))
> + return -ENOMEM;

Use the original error code instead.
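
Roughly what I mean, as a userspace sketch. It assumes the allocator
reports failure through an error-encoded pointer, as the IS_ERR() check
in the patch implies; if it returns NULL on failure instead, a plain
NULL check with -ENOMEM is the right form. The macros are simplified
stand-ins for the kernel's ERR_PTR()/IS_ERR()/PTR_ERR(), and alloc_ctl()
and probe() are hypothetical names.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO	4095
#define ERR_PTR(err)	((void *)(intptr_t)(err))
#define PTR_ERR(ptr)	((long)(intptr_t)(ptr))
#define IS_ERR(ptr)	((uintptr_t)(ptr) >= (uintptr_t)-MAX_ERRNO)

/* Hypothetical allocator: fails with the given errno, else allocates. */
static void *alloc_ctl(int fail_with)
{
	void *p;

	if (fail_with)
		return ERR_PTR(-fail_with);

	p = malloc(16);
	return p ? p : ERR_PTR(-ENOMEM);
}

/* Hypothetical probe: pass the allocator's error code up unchanged. */
static int probe(int fail_with)
{
	void *ctl = alloc_ctl(fail_with);

	if (IS_ERR(ctl))
		return PTR_ERR(ctl);	/* not a hard-coded -ENOMEM */

	free(ctl);
	return 0;
}
```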

> +
> + edac_dev->edac_check = al_l1_edac_check;
> + edac_dev->dev = dev;
> + edac_dev->mod_name = DRV_NAME;
> + edac_dev->dev_name = dev_name(dev);
> + edac_dev->ctl_name = "L1 cache";

This should not contain spaces and could be a bit more specific.

> + platform_set_drvdata(pdev, edac_dev);
> +
> + ret = edac_device_add_device(edac_dev);
> + if (ret) {
> + dev_err(dev, "Failed to add L1 edac device\n");

Move this printk down into the error path and maybe print the error
code as well. You do not cover the -ENOMEM failure.
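
Something along these lines, as a userspace sketch only: one message on
the shared error path that includes the error code, so every failure is
reported. setup(), step_alloc() and step_add() are hypothetical names,
and fprintf() stands in for dev_err().

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical steps; each returns 0 or a negative errno. */
static int step_alloc(int fail)
{
	return fail ? -ENOMEM : 0;
}

static int step_add(int fail)
{
	return fail ? -EINVAL : 0;
}

static int setup(int fail_alloc, int fail_add)
{
	int ret;

	ret = step_alloc(fail_alloc);
	if (ret)
		goto err;

	ret = step_add(fail_add);
	if (ret)
		goto err_free;

	return 0;

err_free:
	/* Release what step_alloc() acquired (nothing in this sketch). */
err:
	/* Single message for all failures, including the error code. */
	fprintf(stderr, "Failed to add L1 edac device: %d\n", ret);
	return ret;
}
```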

-Robert

> + goto err;
> + }
> +
> + return 0;
> +err:
> + edac_device_free_ctl_info(edac_dev);
> +
> + return ret;
> +}