Re: [PATCH 11/14] iommufd-lu: Persist iommu hardware pagetables for live update

From: Samiullah Khawaja

Date: Mon Mar 23 2026 - 17:36:02 EST


On Mon, Mar 23, 2026 at 01:28:13PM -0700, Vipin Sharma wrote:
On Tue, Feb 03, 2026 at 10:09:45PM +0000, Samiullah Khawaja wrote:
From: YiFei Zhu <zhuyifei@xxxxxxxxxx>

The caller is expected to mark each HWPT to be preserved with an ioctl
call, with a token that will be used in restore. At preserve time, each
HWPT's domain is then called with iommu_domain_preserve to preserve the

Nit: Add () after iommu_domain_preserve, it makes easier to know this is
a function.

Will do.

diff --git a/drivers/iommu/iommufd/liveupdate.c b/drivers/iommu/iommufd/liveupdate.c
+static int iommufd_save_hwpts(struct iommufd_ctx *ictx,
+ struct iommufd_lu *iommufd_lu,
+ struct liveupdate_session *session)
+{
+ struct iommufd_hwpt_paging *hwpt, **hwpts = NULL;

We should rename hwpts to something to denote that it is a list.

Agreed. Will do.

+ struct iommu_domain_ser *domain_ser;
+ struct iommufd_hwpt_lu *hwpt_lu;
+ struct iommufd_object *obj;
+ unsigned int nr_hwpts = 0;
+ unsigned long index;
+ unsigned int i;
+ int rc = 0;
+
+ if (iommufd_lu) {
+ hwpts = kcalloc(iommufd_lu->nr_hwpts, sizeof(*hwpts),
+ GFP_KERNEL);
+ if (!hwpts)
+ return -ENOMEM;
+ }
+
+ xa_lock(&ictx->objects);
+ xa_for_each(&ictx->objects, index, obj) {
+ if (obj->type != IOMMUFD_OBJ_HWPT_PAGING)
+ continue;
+
+ hwpt = container_of(obj, struct iommufd_hwpt_paging, common.obj);
+ if (!hwpt->lu_preserve)
+ continue;
+
+ if (hwpt->ioas) {
+ /*
+ * Obtain exclusive access to the IOAS and IOPT while we
+ * set immutability
+ */
+ mutex_lock(&hwpt->ioas->mutex);
+ down_write(&hwpt->ioas->iopt.domains_rwsem);
+ down_write(&hwpt->ioas->iopt.iova_rwsem);
+
+ hwpt->ioas->iopt.lu_map_immutable = true;

It feels odd that this is not cleaned up in this function as a part of
its error handling. Now it becomes caller resposibility to handle clean
up for the side effect created by this function.

The immutablity is set early during preservation and it might be
required to unset it in the caller also when error occurs. I have to
rework this to not acquire a mutex lock in a spin lock, so I think this
part can also be reworked by doing it in following steps,

- Alloc memory for preservation.
- Mark immutable.
- Preserve using IOMMU core APIs.

IMO, this function should clean up lu_map_immutable instread of callers
to make sure they call cleanups. You can also try exploring XA_MARK_*
APIs if that can help in cleaning up easily.

Agreed.

+int iommufd_liveupdate_unregister_lufs(void)
+{
+ WARN_ON(iommu_liveupdate_unregister_flb(&iommufd_lu_handler));
+
+ return liveupdate_unregister_file_handler(&iommufd_lu_handler);

This seems little insconsistent, iommu_liveupdate_unregister_flb() has
WARN_ON and liveupdate_unregister_file_handler() does not.

The error from unregister_file_handler() is propagated to the caller,
but the error from unregister_flb() is not propagated as need to call
the unregister_file_handler().

Also, refer https://lore.kernel.org/all/20260226160353.6f3371bc@xxxxxxxxxxx/

Yes. This will need updates according to the new changes in LUO where
the unregister functions has void return type.

diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
@@ -775,11 +775,21 @@ static int __init iommufd_init(void)
if (ret)
goto err_misc;
}
+
+ if (IS_ENABLED(CONFIG_IOMMU_LIVEUPDATE)) {
+ ret = iommufd_liveupdate_register_lufs();
+ if (ret)
+ goto err_vfio_misc;
+ }
+

Do we need IS_ENABLED here? iommufd_private.h is providing definition
for both values of CONFIG_IOMMU_LIVEUPDATE.

Nit: What is lufs? Should we just rename to iommufd_liveupdate_register()?

Will update both.

ret = iommufd_test_init();
if (ret)
- goto err_vfio_misc;
+ goto err_lufs;
return 0;

+err_lufs:
+ if (IS_ENABLED(CONFIG_IOMMU_LIVEUPDATE))
+ iommufd_liveupdate_unregister_lufs();
err_vfio_misc:
if (IS_ENABLED(CONFIG_IOMMUFD_VFIO_CONTAINER))
misc_deregister(&vfio_misc_dev);
@@ -791,6 +801,8 @@ static int __init iommufd_init(void)
static void __exit iommufd_exit(void)
{
iommufd_test_exit();
+ if (IS_ENABLED(CONFIG_IOMMU_LIVEUPDATE))
+ iommufd_liveupdate_unregister_lufs();

Same as above.

Will update.

diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
@@ -1427,6 +1429,12 @@ struct iopt_pages *iopt_alloc_file_pages(struct file *file,
pages->file = get_file(file);
pages->start = start - start_byte;
pages->type = IOPT_ADDRESS_FILE;
+
+ pages->seals = 0;

This is already 0.

Will update.

+ seals = memfd_get_seals(file);
+ if (seals > 0)
+ pages->seals = seals;
+
return pages;
}

--- /dev/null
+++ b/include/linux/kho/abi/iommufd.h
+
+struct iommufd_lu {
+ unsigned int nr_hwpts;

Should this be u32 or u64?

+ struct iommufd_hwpt_lu hwpts[];
+};
+

Should this also be __packed?

Will do both u32 and packed.

+#endif /* _LINUX_KHO_ABI_IOMMUFD_H */
--
2.53.0.rc2.204.g2597b5adb4-goog