Re: [PATCH] EDAC/versalnet: Report PFN and page offset for DDR errors
From: Shubhrajyoti Datta
Date: Mon Apr 27 2026 - 06:50:24 EST
On Wed, Apr 22, 2026 at 9:29 AM Srivatsa S. Bhat <srivatsa@xxxxxxxxxxxxx> wrote:
>
> On Wed, Apr 01, 2026 at 11:28:10AM +0530, Prasanna Kumar T S M wrote:
> >
> >
> > On 29-03-2026 18:14, Shubhrajyoti Datta wrote:
> > > Currently, DDRMC correctable and uncorrectable error events are reported
> > > to EDAC with page frame number (pfn) and offset set to zero.
> > > This information is not useful to locate the address for memory errors.
> > >
> > > Compute the physical address from the error information and extract
> > > the page frame number and offset before calling edac_mc_handle_error().
> > > This provides the actual memory location information to the userspace.
> > >
> > > Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@xxxxxxx>
> > > ---
> > >
>
> [...]
>
> > > if (stat->error_type == MC5_ERR_TYPE_UE) {
> > > pinf = stat->ueinfo[stat->channel];
> > > + pa = convert_to_physical(priv, pinf, ctl_num, error_data);
> > > + pfn = PHYS_PFN(pa);
> > > snprintf(priv->message, sizeof(priv->message),
> > > "Error type:%s controller %d Addr at %lx\n",
> > > - "UE", ctl_num, convert_to_physical(priv, pinf, ctl_num, error_data));
> > > + "UE", ctl_num, pa);
> > > edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
> > > - 1, 0, 0, 0, 0, 0, -1,
> > > + 1, pfn, offset_in_page(pa), 0, 0, 0, -1,
> > > priv->message, "");
> > > - pa = convert_to_physical(priv, pinf, ctl_num, error_data);
> > > - pfn = PHYS_PFN(pa);
> > > -
> > > if (IS_ENABLED(CONFIG_MEMORY_FAILURE)) {
> > > err = memory_failure(pfn, MF_ACTION_REQUIRED);
> > > if (err)
> >
> > Nit: pa and pfn calculation can be moved out of the if() condition.
> >
>
> Hi Shubhrajyoti,
>
> Could you revise this patch with a similar cleanup for the versalnet
> driver as you did for the versal driver to avoid code duplication,
> please?
> https://lore.kernel.org/all/20260415060239.733200-1-shubhrajyoti.datta@xxxxxxx/#t
Let me know if the below looks fine:

if (stat->error_type == MC5_ERR_TYPE_CE) {
	pinf = stat->ceinfo[stat->channel];
	type = HW_EVENT_ERR_CORRECTED;
}
if (stat->error_type == MC5_ERR_TYPE_UE) {
	pinf = stat->ueinfo[stat->channel];
	type = HW_EVENT_ERR_UNCORRECTED;
}

pa = convert_to_physical(priv, pinf, ctl_num, error_data);
pfn = PHYS_PFN(pa);

snprintf(priv->message, sizeof(priv->message),
	 "Error type:%s Controller %d Addr at %lx\n",
	 type == HW_EVENT_ERR_UNCORRECTED ? "UE" : "CE",
	 ctl_num, pa);

edac_mc_handle_error(type, mci,
		     1, pfn, pa & ~PAGE_MASK, 0, 0, 0, -1,
		     priv->message, "");

if (stat->error_type == MC5_ERR_TYPE_UE) {
	if (IS_ENABLED(CONFIG_MEMORY_FAILURE)) {
		err = memory_failure(pfn, MF_ACTION_REQUIRED);
		if (err)
			edac_dbg(2, "memory_failure() error: %d\n", err);
		else
			edac_dbg(2, "Poison page at PA 0x%lx\n", pa);
	}
}
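
For reference, the PFN/offset split passed to edac_mc_handle_error() above
can be sketched in plain userspace C. This is only an illustration, not
driver code: it assumes 4 KiB pages, and the demo_* names are hypothetical
stand-ins for the kernel's PHYS_PFN(), PAGE_MASK and offset_in_page().

```c
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel takes PAGE_SHIFT from the arch config. */
#define DEMO_PAGE_SHIFT	12
#define DEMO_PAGE_SIZE	(UINT64_C(1) << DEMO_PAGE_SHIFT)
#define DEMO_PAGE_MASK	(~(DEMO_PAGE_SIZE - 1))

/* Hypothetical stand-in for PHYS_PFN(): drop the in-page bits. */
static uint64_t demo_phys_pfn(uint64_t pa)
{
	return pa >> DEMO_PAGE_SHIFT;
}

/* Hypothetical stand-in for offset_in_page(): keep only the in-page bits. */
static uint64_t demo_offset_in_page(uint64_t pa)
{
	return pa & ~DEMO_PAGE_MASK;
}

/* Rebuild the physical address from the two values reported to EDAC. */
static uint64_t demo_pa_from_pfn(uint64_t pfn, uint64_t offset)
{
	return (pfn << DEMO_PAGE_SHIFT) | offset;
}
```

With these assumptions, a physical address of 0x800012345 splits into
pfn 0x800012 and offset 0x345, and the two recombine to the same address.
That is why "pa & ~PAGE_MASK" here should report the same offset as the
offset_in_page(pa) call in the original patch.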
>
> Thank you!
>
> Regards,
> Srivatsa
> Microsoft Linux Systems Group