Re: [PATCH 04/14] iommu/pages: Add APIs to preserve/unpreserve/restore iommu pages
From: Samiullah Khawaja
Date: Fri Mar 20 2026 - 14:24:44 EST
On Fri, Mar 20, 2026 at 05:27:46PM +0000, Pranjal Shrivastava wrote:
On Tue, Mar 03, 2026 at 06:41:26PM +0000, Samiullah Khawaja wrote:
On Tue, Mar 03, 2026 at 04:42:02PM +0000, Ankit Soni wrote:
> On Tue, Feb 03, 2026 at 10:09:38PM +0000, Samiullah Khawaja wrote:
> > IOMMU pages are allocated/freed using APIs using struct ioptdesc. For
> > the proper preservation and restoration of ioptdesc add helper
> > functions.
> >
> > Signed-off-by: Samiullah Khawaja <skhawaja@xxxxxxxxxx>
> > ---
> > drivers/iommu/iommu-pages.c | 74 +++++++++++++++++++++++++++++++++++++
> > drivers/iommu/iommu-pages.h | 30 +++++++++++++++
> > 2 files changed, 104 insertions(+)
> >
> > diff --git a/drivers/iommu/iommu-pages.c b/drivers/iommu/iommu-pages.c
> > index 3bab175d8557..588a8f19b196 100644
> > --- a/drivers/iommu/iommu-pages.c
> > +++ b/drivers/iommu/iommu-pages.c
> > @@ -6,6 +6,7 @@
> > #include "iommu-pages.h"
> > #include <linux/dma-mapping.h>
> > #include <linux/gfp.h>
> > +#include <linux/kexec_handover.h>
> > #include <linux/mm.h>
> >
> > #define IOPTDESC_MATCH(pg_elm, elm) \
> > @@ -131,6 +132,79 @@ void iommu_put_pages_list(struct iommu_pages_list *list)
> > }
> > EXPORT_SYMBOL_GPL(iommu_put_pages_list);
> >
> > +#if IS_ENABLED(CONFIG_IOMMU_LIVEUPDATE)
> > +void iommu_unpreserve_page(void *virt)
> > +{
> > + kho_unpreserve_folio(ioptdesc_folio(virt_to_ioptdesc(virt)));
> > +}
> > +EXPORT_SYMBOL_GPL(iommu_unpreserve_page);
> > +
> > +int iommu_preserve_page(void *virt)
> > +{
> > + return kho_preserve_folio(ioptdesc_folio(virt_to_ioptdesc(virt)));
> > +}
> > +EXPORT_SYMBOL_GPL(iommu_preserve_page);
> > +
> > +void iommu_unpreserve_pages(struct iommu_pages_list *list, int count)
> > +{
> > + struct ioptdesc *iopt;
> > +
> > + if (!count)
> > + return;
> > +
> > + /* If less than zero then unpreserve all pages. */
> > + if (count < 0)
> > + count = 0;
> > +
> > + list_for_each_entry(iopt, &list->pages, iopt_freelist_elm) {
> > + kho_unpreserve_folio(ioptdesc_folio(iopt));
> > + if (count > 0 && --count == 0)
> > + break;
> > + }
> > +}
> > +EXPORT_SYMBOL_GPL(iommu_unpreserve_pages);
> > +
> > +void iommu_restore_page(u64 phys)
> > +{
> > + struct ioptdesc *iopt;
> > + struct folio *folio;
> > + unsigned long pgcnt;
> > + unsigned int order;
> > +
> > + folio = kho_restore_folio(phys);
> > + BUG_ON(!folio);
> > +
> > + iopt = folio_ioptdesc(folio);
>
> iopt->incoherent = false; should be here?
>
Yes, this should be set here. I will update this.
I'm wondering if we are silently losing state here. What if the
preserved page was actually incoherent in the previous kernel?
I understand we likely need to initialize it to false here because we
don't have a dev pointer for DMA sync operations at this low level (though
x86 uses clflush).
But when is it set back to "incoherent" again? I don't see that
happening during the driver re-attach phase?
This can be done during restore_domain, as the domain has a reference to
the dev when it is recreated. I will update the walker and add this in
the next revision.
Should we at least mention that this API intentionally overwrites the
preserved coherency state and that these pages must explicitly be marked
incoherent again later by the driver based on its preserved HW state OR
by the IOMMUFD re-attach?
Agreed. I will add a note about this.
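To make the flow discussed above concrete, here is a rough user-space
model, not kernel code and not what the next revision will contain. It
only illustrates the ordering: the page-level restore always clears the
flag because it has no dev, and a later domain-level step re-marks the
pages from state the driver preserved separately. All names here
(mock_ioptdesc, mock_domain, hw_incoherent, mock_restore_page,
mock_restore_domain) are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ioptdesc; only the flag under discussion. */
struct mock_ioptdesc {
	bool incoherent;
};

/* Hypothetical stand-in for a restored domain and its preserved HW state. */
struct mock_domain {
	bool hw_incoherent;            /* assumed to be preserved by the driver */
	struct mock_ioptdesc **pages;  /* table pages reached by the walker */
	size_t npages;
};

/* Models the page-level restore: no dev is available, so start coherent. */
static void mock_restore_page(struct mock_ioptdesc *iopt)
{
	assert(iopt != NULL);
	iopt->incoherent = false;
}

/*
 * Models the domain-level restore: the domain knows its device, so it can
 * re-mark pages as incoherent. Returns the number of pages re-marked.
 */
static size_t mock_restore_domain(struct mock_domain *domain)
{
	size_t marked = 0;

	assert(domain != NULL);
	for (size_t i = 0; i < domain->npages; i++) {
		mock_restore_page(domain->pages[i]);
		if (domain->hw_incoherent) {
			domain->pages[i]->incoherent = true;
			marked++;
		}
	}
	return marked;
}
```

The point of the split is that the coherency state is derived from the
preserved domain/HW state rather than from the page itself, which is why
the page-level API can overwrite it unconditionally.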
> > +
> > + order = folio_order(folio);
> > + pgcnt = 1UL << order;
> > + mod_node_page_state(folio_pgdat(folio), NR_IOMMU_PAGES, pgcnt);
> > + lruvec_stat_mod_folio(folio, NR_SECONDARY_PAGETABLE, pgcnt);
> > +}
> > +EXPORT_SYMBOL_GPL(iommu_restore_page);
[------ snip >8 -------]
Thanks,
Praan
Thanks,
Sami