Re: [PATCH rc v7 6/6] iommu: Fix UAF in pci_dev_reset_iommu_done() due to concurrent detach

From: Nicolin Chen

Date: Tue Apr 21 2026 - 14:11:43 EST


On Tue, Apr 21, 2026 at 07:41:03AM +0000, Tian, Kevin wrote:
> > From: Nicolin Chen <nicolinc@xxxxxxxxxx>
> > Sent: Sunday, April 19, 2026 7:42 AM
> >
> > In __iommu_group_set_domain_internal(), concurrent domain attachments
> > are
> > rejected when any device in the group is recovering. This is necessary to
> > fence concurrent attachments to a multi-device group where devices might
> > share the same RID due to PCI DMA alias quirks.
> >
> > However, IOMMU_SET_DOMAIN_MUST_SUCCEED callers (detach/teardown
> > paths such
> > as __iommu_group_set_core_domain and
> > __iommu_release_dma_ownership) should
> > not be rejected, as the domain would be free-ed anyway in this nofail path
> > while group->domain is still pointing to it. So pci_dev_reset_iommu_done()
> > could trigger a UAF when re-attaching group->domain.
>
> As I noted in my reply to v6, a WARN_ON will be triggered before any UAF:
>
> static void __iommu_group_set_domain_nofail(struct iommu_group *group,
> struct iommu_domain *new_domain)
> {
> WARN_ON(__iommu_group_set_domain_internal(
> group, new_domain, IOMMU_SET_DOMAIN_MUST_SUCCEED));
> }

OK. I think this fix should be just "do not fail MUST_SUCCEED" or so.

> > @@ -2482,6 +2485,13 @@ static int
> > __iommu_group_set_domain_internal(struct iommu_group *group,
> > */
> > result = 0;
> > for_each_group_device(group, gdev) {
> > + /*
> > + * Device under recovery is attached to group-
> > >blocking_domain.
> > + * Don't change that. pci_dev_reset_iommu_done() will re-
> > attach
> > + * its domain to the updated group->domain, after the
> > recovery.
> > + */
> > + if (gdev->blocked)
> > + continue;
>
> This reminds me one thing. Ideally the blocked device really doesn't care
> whether group->domain is the one before resetting or a different one
> changed in middle. It's blocked then doesn't refer to any non-blocking
> domains. After reset is done it's re-attached to whatever group->domain
> is at that time.
>
> Then sounds there is no reason to block attach/replace too. Just skip
> the blocked device and update group->domain then it will be picked up
> later at reset done, just like done here for detach.

There might be devices in the same group sharing RID?

> Sashiko [1] gave another interesting comment about dma aliasing caused
> by PCIe to PCI/PCI-X bridge - devices behind the bridge share a same
> RID (then same device/context entry in IOMMU). In this case unblocking
> devA could prematurely unblock devB which is actively undergoing a reset.

Exactly. I recall we talked about it when introducing this entire
reset piece: there was a piece of condition in the reset helpers
skipping aliasing groups, then we dropped it to focus on singleton
groups for the first version. Maybe we can resume the discussion,
but I doubt we could exclude those RID sharing cases...

Nicolin