RE: [RFC PATCH v7 12/12] memory: RAS2: Add memory RAS2 driver

From: Shiju Jose
Date: Wed Apr 03 2024 - 10:04:14 EST


Hi Daniel,

>-----Original Message-----
>From: Daniel Ferguson <danielf@xxxxxxxxxxxxxxxxxxxxxx>
>Sent: 28 March 2024 23:42
>To: Shiju Jose <shiju.jose@xxxxxxxxxx>; linux-cxl@xxxxxxxxxxxxxxx; linux-
>acpi@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; dan.j.williams@xxxxxxxxx;
>dave@xxxxxxxxxxxx; Jonathan Cameron <jonathan.cameron@xxxxxxxxxx>;
>dave.jiang@xxxxxxxxx; alison.schofield@xxxxxxxxx; vishal.l.verma@xxxxxxxxx;
>ira.weiny@xxxxxxxxx
>Cc: linux-edac@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
>david@xxxxxxxxxx; Vilas.Sridharan@xxxxxxx; leo.duran@xxxxxxx;
>Yazen.Ghannam@xxxxxxx; rientjes@xxxxxxxxxx; jiaqiyan@xxxxxxxxxx;
>tony.luck@xxxxxxxxx; Jon.Grimm@xxxxxxx; dave.hansen@xxxxxxxxxxxxxxx;
>rafael@xxxxxxxxxx; lenb@xxxxxxxxxx; naoya.horiguchi@xxxxxxx;
>james.morse@xxxxxxx; jthoughton@xxxxxxxxxx; somasundaram.a@xxxxxxx;
>erdemaktas@xxxxxxxxxx; pgonda@xxxxxxxxxx; duenwen@xxxxxxxxxx;
>mike.malvestuto@xxxxxxxxx; gthelen@xxxxxxxxxx;
>wschwartz@xxxxxxxxxxxxxxxxxxx; dferguson@xxxxxxxxxxxxxxxxxxx;
>tanxiaofei <tanxiaofei@xxxxxxxxxx>; Zengtao (B) <prime.zeng@xxxxxxxxxxxxx>;
>kangkang.shen@xxxxxxxxxxxxx; wanghuiqiang <wanghuiqiang@xxxxxxxxxx>;
>Linuxarm <linuxarm@xxxxxxxxxx>; wbs@xxxxxxxxxxxxxxxxxxxxxx
>Subject: Re: [RFC PATCH v7 12/12] memory: RAS2: Add memory RAS2 driver
>
>> +/*
...
>> +
>> +static int ras2_probe(struct platform_device *pdev) {
>> + int ret, id;
>> + struct mbox_client *cl;
>> + struct device *hw_scrub_dev;
>> + struct ras2_context *ras2_ctx;
>> + char scrub_name[RAS2_MAX_NAME_LENGTH];
>> +
>> + ras2_ctx = devm_kzalloc(&pdev->dev, sizeof(*ras2_ctx), GFP_KERNEL);
>> + if (!ras2_ctx)
>> + return -ENOMEM;
>> +
>> + ras2_ctx->dev = &pdev->dev;
>> + ras2_ctx->ops = &ras2_hw_ops;
>> + spin_lock_init(&ras2_ctx->spinlock);
>> + platform_set_drvdata(pdev, ras2_ctx);
>> +
>> + cl = &ras2_ctx->mbox_client;
>> + /* Request mailbox channel */
>> + cl->dev = &pdev->dev;
>> + cl->tx_done = ras2_tx_done;
>> + cl->knows_txdone = true;
>> + ras2_ctx->pcc_subspace_idx = *((int *)pdev->dev.platform_data);
>> + dev_dbg(&pdev->dev, "pcc-subspace-id=%d\n", ras2_ctx->pcc_subspace_idx);
>> + ret = ras2_register_pcc_channel(ras2_ctx);
>
>In our enabling activities, we have found a challenge here.
>Our hardware has a single PCC channel corresponding to a single platform-wide
>scrub interface. This driver, following the ACPI spec, will create a new scrub
>node for each NUMA node. However, for us, this means that each scrub device
>will try to map the same PCC channel, and this causes an error.

Would failing to probe cleanly be enough for your platform? That is, demote the
error messages (or whichever one causes this problem) to dev_dbg().
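
To illustrate the idea, here is a rough userspace sketch, not driver code. It
models several per-node probes sharing one PCC subspace: the first probe claims
the channel, and later probes fail quietly with only a debug-level message.
The sketch_* names, the claimed-channel table, and the choice of -EBUSY are
all assumptions made for illustration; the real driver would return whatever
the PCC channel request actually fails with.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical upper bound on PCC subspace ids, for this sketch only. */
#define SKETCH_MAX_SUBSPACES 8
static_assert(SKETCH_MAX_SUBSPACES > 0, "need at least one subspace");

/* Stand-in for dev_dbg(): prints only when built with -DSKETCH_DEBUG=1. */
#ifndef SKETCH_DEBUG
#define SKETCH_DEBUG 0
#endif
#define sketch_dbg(...) \
	do { if (SKETCH_DEBUG) fprintf(stderr, __VA_ARGS__); } while (0)

/* Hypothetical table recording which subspace ids already have an owner. */
static bool sketch_claimed[SKETCH_MAX_SUBSPACES];

/* Claim a PCC subspace; a second claim of the same id fails without noise. */
static int sketch_claim_pcc_channel(int node, int subspace_idx)
{
	if (subspace_idx < 0 || subspace_idx >= SKETCH_MAX_SUBSPACES)
		return -EINVAL;

	if (sketch_claimed[subspace_idx]) {
		/* Debug level only, so a shared channel does not spam the log. */
		sketch_dbg("node %d: pcc-subspace-id=%d already in use\n",
			   node, subspace_idx);
		return -EBUSY;
	}

	sketch_claimed[subspace_idx] = true;
	return 0;
}

/* Model of the probe path: bail out early if the channel is unavailable. */
static int sketch_probe(int node, int subspace_idx)
{
	int ret = sketch_claim_pcc_channel(node, subspace_idx);

	if (ret < 0)
		return ret;

	/* Scrub device registration would follow here in a real driver. */
	return 0;
}
```

With this model, the probe for the first node succeeds and the probes for the
remaining nodes that point at the same subspace return an error without
printing anything at the default log level.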
>> + if (ret < 0)
>> + return ret;
>> +
>> + ret = devm_add_action_or_reset(&pdev->dev, devm_ras2_release, ras2_ctx);
>> + if (ret < 0)
>> + return ret;
>> +
>> + if (ras2_is_patrol_scrub_support(ras2_ctx)) {
>> + id = ida_alloc(&ras2_ida, GFP_KERNEL);
>> + if (id < 0)
>> + return id;
>> + ras2_ctx->id = id;
>> + snprintf(scrub_name, sizeof(scrub_name), "%s%d", RAS2_SCRUB, id);
>> + dev_set_name(&pdev->dev, RAS2_ID_FORMAT, id);
>> + hw_scrub_dev = devm_scrub_device_register(&pdev->dev, scrub_name,
>> + ras2_ctx, &ras2_scrub_ops,
>> + 0, NULL);
>> + if (PTR_ERR_OR_ZERO(hw_scrub_dev))
>> + return PTR_ERR_OR_ZERO(hw_scrub_dev);
>> + }
>> + ras2_ctx->scrub_dev = hw_scrub_dev;
>> +
>> + return 0;
>> +}
>
Thanks,
Shiju