Re: [PATCH v5 0/3] mm: Implement ECC handling for pfn with no struct page

From: Jiaqi Yan

Date: Tue Jan 20 2026 - 11:28:52 EST


On Fri, Jan 16, 2026 at 9:36 PM Ankit Agrawal <ankita@xxxxxxxxxx> wrote:
>
> >>
> >> v2 -> v3
> >> - Rebased to v6.17-rc7.
> >> - Skipped the unmapping of PFNMAP during reception of poison. Suggested by
> >> Jason Gunthorpe, Jiaqi Yan, Vikram Sethi (Thanks!)
> >> - Updated the check to prevent multiple registration to the same PFN
> >> range using interval_tree_iter_first. Thanks Shameer Kolothum for the
> >> suggestion.
> >> - Removed the callback function in the nvgrace-gpu requiring tracking of
> >> poisoned PFN as it isn't required anymore.
> >
> > Hi Ankit,
> >
> >
> > I get that for the nvgrace-gpu driver, you removed pfn_address_space_ops
> > because there is no need to unmap a poisoned HBM page.
> >
> > What about the nvgrace-egm driver? Now that you removed the
> > pfn_address_space_ops callback from pfn_address_space in [1], how can
> > the nvgrace-egm driver learn about poisoned EGM pages at runtime?
> >
> > I expect the functionality to return retired pages should also include
> > runtime poisoned pages, which are not in the list queried from
> > egm-retired-pages-data-base during initialization. Or maybe my
> > expectation is wrong/obsolete?
>
> Hi Jiaqi, yes, the EGM code will account for runtime poisoned pages as
> well. It will instead use the pfn_to_vma_pgoff callback merged through
> https://github.com/torvalds/linux/commit/e6dbcb7c0e7b508d443a9aa6f77f63a2f83b1ae4

Thank you! Sorry I wasn't following that thread closely and missed it.

>
> > [1] https://lore.kernel.org/linux-mm/20230920140210.12663-2-ankita@xxxxxxxxxx
> > [2] https://lore.kernel.org/kvm/20250904040828.319452-12-ankita@xxxxxxxxxx
>