Re: [PATCH v6 2/2] soc: amazon: al-pos-edac: Introduce Amazon's Annapurna Labs POS EDAC driver

From: James Morse
Date: Mon Oct 21 2019 - 12:43:18 EST


Hi Talel,

On 10/10/2019 12:41, Talel Shenhar wrote:
> The Amazon's Annapurna Labs SoCs includes Point Of Serialization error
> logging unit that reports an error in case write error (e.g . Attempt to

(This is tricky to parse. "error in case write error" -> "error when a write error occurs"?)

> write to a read only register).
> This error shall be reported to EDAC subsystem as uncorrectable-error.


> diff --git a/MAINTAINERS b/MAINTAINERS
> index 55199ef..a77d554 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -757,6 +757,13 @@ F: drivers/tty/serial/altera_jtaguart.c
> F: include/linux/altera_uart.h
> F: include/linux/altera_jtaguart.h
>
> +AMAZON ANNAPURNA LABS POS EDAC DRIVER
> +M: Talel Shenhar <talel@xxxxxxxxxx>
> +M: Talel Shenhar <talelshenhar@xxxxxxxxx>
> +S: Maintained
> +F: Documentation/devicetree/bindings/edac/amazon,al-pos-edac.yaml

> +F: drivers/edac/al-pos-edac.c

~s/-/_/


> diff --git a/drivers/edac/al_pos_edac.c b/drivers/edac/al_pos_edac.c
> new file mode 100644
> index 00000000..a85ab67
> --- /dev/null
> +++ b/drivers/edac/al_pos_edac.c
> @@ -0,0 +1,173 @@

> +static int al_pos_handle(struct al_pos_edac *al_pos)
> +{

> + log1 = readl_relaxed(al_pos->mmio_base + AL_POS_ERROR_LOG_1);
> + if (!FIELD_GET(AL_POS_ERROR_LOG_1_VALID, log1))
> + return 0;

[...]

> + edac_device_handle_ue(al_pos->edac_dev, 0, 0, msg);
> +
> + return 1;
> +}
[...]

> +static irqreturn_t al_pos_irq_handler(int irq, void *info)
> +{

> + if (al_pos_handle(al_pos))
> + return IRQ_HANDLED;
> + return IRQ_NONE;
> +}


> +static int al_pos_probe(struct platform_device *pdev)
> +{
> + struct edac_device_ctl_info *edac_dev;
> + struct al_pos_edac *al_pos;
> + int ret;
> +
> + edac_dev = edac_device_alloc_ctl_info(sizeof(*al_pos), DRV_NAME, 1,
> + DRV_NAME, 1, 0, NULL, 0,
> + edac_device_alloc_index());
> + if (!edac_dev)
> + return -ENOMEM;
> +
> + al_pos = edac_dev->pvt_info;
> + al_pos->edac_dev = edac_dev;
> + platform_set_drvdata(pdev, al_pos);
> +
> + al_pos->mmio_base = devm_platform_ioremap_resource(pdev, 0);
> + if (IS_ERR(al_pos->mmio_base)) {
> + dev_err(&pdev->dev, "failed to ioremap memory (%ld)\n",
> + PTR_ERR(al_pos->mmio_base));

edac_device_free_ctl_info(al_pos->edac_dev) or goto err_free_edac ?

> + return PTR_ERR(al_pos->mmio_base);
> + }
> +
> + al_pos->irq = platform_get_irq(pdev, 0);
> + if (al_pos->irq <= 0)
> + edac_dev->edac_check = al_pos_edac_check;
> +
> + edac_dev->dev = &pdev->dev;
> + edac_dev->mod_name = DRV_NAME;
> + edac_dev->dev_name = dev_name(&pdev->dev);
> + edac_dev->ctl_name = "POS";

Does this show up in sysfs? The 'AL_' prefix may make it easier to find the corresponding
driver. (The TLA space is a little crowded!)


> + ret = edac_device_add_device(edac_dev);
> + if (ret) {
> + dev_err(&pdev->dev, "Failed to add edac device\n");
> + goto err_free_edac;
> + }
> +
> + if (al_pos->irq > 0) {
> + ret = devm_request_irq(&pdev->dev,
> + al_pos->irq,
> + al_pos_irq_handler,

> + 0,

Can this be IRQF_SHARED? This lets other devices register the interrupt too, which is
easily allowed if you can identify whether your device has triggered the interrupt. (which
you are already doing with the valid bit in your log1 register).


> + pdev->name,
> + pdev);
> + if (ret != 0) {
> + dev_err(&pdev->dev,
> + "failed to register to irq %d (%d)\n",
> + al_pos->irq, ret);
> + goto err_remove_edac;
> + }
> + }
> +
> + return 0;
> +
> +err_remove_edac:
> + edac_device_del_device(edac_dev->dev);
> +err_free_edac:
> + edac_device_free_ctl_info(edac_dev);
> +
> + return ret;
> +}


With the edac_dev-leak fixed and the -/_ in MAINTAINERS:

Reviewed-by: James Morse <james.morse@xxxxxxx>


Thanks,

James