Re: [PATCH 05/14] iommupt: Implement preserve/unpreserve/restore callbacks
From: Samiullah Khawaja
Date: Mon Apr 13 2026 - 15:31:54 EST
On Fri, Apr 10, 2026 at 08:16:50PM -0300, Jason Gunthorpe wrote:
On Fri, Apr 10, 2026 at 11:02:52PM +0000, Samiullah Khawaja wrote:
On Fri, Apr 10, 2026 at 11:16:52AM -0300, Jason Gunthorpe wrote:
> On Fri, Mar 20, 2026 at 09:57:08PM +0000, Pranjal Shrivastava wrote:
> > > +static int __restore_tables(struct pt_range *range, void *arg,
> > > + unsigned int level, struct pt_table_p *table)
> > > +{
> > > + struct pt_state pts = pt_init(range, level, table);
> > > + int ret;
> > > +
> > > + for_each_pt_level_entry(&pts) {
> > > + if (pts.type == PT_ENTRY_TABLE) {
> > > + iommu_restore_page(virt_to_phys(pts.table_lower));
> > > + ret = pt_descend(&pts, arg, __restore_tables);
> > > + if (ret)
> > > + return ret;
> >
> > If pt_descend() returns an error, we immediately return ret. However, we
> > have already successfully called iommu_restore_page() on pts.table_lower
> > and potentially many other tables earlier in the loop or higher up in
> > the tree..
>
> It doesn't return an error, it just propogates errors from the
> callbacks which this one never errors. So this is just dead code.
>
> > > +int DOMAIN_NS(restore)(struct iommu_domain *domain, struct iommu_domain_ser *ser)
> > > +{
> > > + struct pt_iommu *iommu_table =
> > > + container_of(domain, struct pt_iommu, domain);
> > > + struct pt_common *common = common_from_iommu(iommu_table);
> > > + struct pt_range range = pt_all_range(common);
> > > +
> > > + iommu_restore_page(ser->top_table);
> > > +
> > > + /* Free new table */
> > > + iommu_free_pages(range.top_table);
> > > +
> > > + /* Set the restored top table */
> > > + pt_top_set(common, phys_to_virt(ser->top_table), ser->top_level);
> > > +
> > > + /* Restore all pages*/
> > > + range = pt_all_range(common);
> > > + return pt_walk_range(&range, __restore_tables, NULL);
>
> This should probably be doing something with the FEAT flags and
> ias/oas too or do you imagine the calling driver has to deal with
> that?
During boot the iommu_domain is recreated in driver and it sets up the
FEAT flags and ias/oas properly. Then this generic callback is used to
restore the page tables.
Currently the FEAT flags of a domain are not explicitly preserved, I
will preserve them and error out here if there is a mismatch.
Hrm, that expands the ABI a bit though
If the only operation on the restored table is free then I suppose it
can be simplified quite a bit, you just need the minimal things that
the collect walker in free touches.
Yes, we use the collect walker during KHO restore of the preserved pages
and also during free. But if I understand correctly, the collect walker
behaviour changes based on some FEAT_ flags (like SIGN_EXTEND). So we
have to be careful if the previous kernel was using different FEAT flags
that affect the collect walker. To handle this, we can just preserve the
u32 features from struct pt_common and deduce everything using that.
Or are you suggesting not to save u32 features at all?
Thinking about it more, we do preserve the top_level, so that could
potentially be used to walk over the page tables of these free-only
domains if we just set up the pts->index and pts->end_index properly by
initializing the range based on the top_level. Are you thinking of a
similar approach to walk these free-only domains?
Is that the intention, free only?
Yes, the intention is to free only. This domain will be immutable and can
only be freed.
If so then the restored iommu_domain should be some special free only
domain too.
Agreed.
Jason