Re: [PATCH] iommufd: Enforce IOMMU_RESV_SW_MSI upon hwpt_paging allocation

From: Nicolin Chen
Date: Wed Jul 31 2024 - 14:13:37 EST


On Wed, Jul 31, 2024 at 07:45:46AM +0000, Tian, Kevin wrote:
> > From: Nicolin Chen <nicolinc@xxxxxxxxxx>
> > Sent: Monday, July 29, 2024 7:51 AM
> >
> > IOMMU_RESV_SW_MSI is a unique region defined by an IOMMU driver. Though it
> > is eventually used by a device for address translation to an MSI location
> > (including nested cases), practically it is a universal region across all
> > domains allocated for the IOMMU that defines it.
> >
> > Currently IOMMUFD core fetches and reserves the region during an attach to
> > an hwpt_paging. It works with a hwpt_paging-only case, but might not work
> > with a nested case where a device could directly attach to a hwpt_nested,
> > bypassing the hwpt_paging attachment.
>
> This probably needs a bit more context. IIUC it's the ARM-side choice
> that instead of letting VMM emulate a vITS for S1 and then map it to
> physical ITS range in S2 it relies on the kernel to continue the msi
> cookie reservation in S2 and then expects the guest to identity map
> it in S1.
>
> With that context if a device is directly attached to a hwpt_nested,
> hwpt_paging attachment is bypassed including the msi doorbell
> setup on the parent S2 then it's broken.

Yes, that's exactly the issue. My bad for oversimplifying it.

> > @@ -364,7 +305,8 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
> >  	}
> >
> >  	if (hwpt_is_paging(hwpt)) {
> > -		rc = iommufd_hwpt_paging_attach(to_hwpt_paging(hwpt), idev);
> > +		rc = iopt_table_enforce_dev_resv_regions(
> > +			&to_hwpt_paging(hwpt)->ioas->iopt, idev->dev);
>
> Is it simpler to extend the original operation to the parent S2 when
> it's hwpt_nested?

Likely. I recall that was what one of our WIP versions did.

> The name iommufd_hwpt_paging_attach() is a bit misleading. The
> actual work there is all about reservations. It doesn't change any
> tracking structure about attachment between device and hwpt.

How about iommufd_hwpt_enforce/remove_rr(), taking a hwpt instead of a
hwpt_paging?
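
Roughly, such a helper would first resolve the parent S2 when it is handed
a hwpt_nested, so the reservation lands on the paging table on either attach
path. Below is a userspace toy sketch of that shape only. It is not kernel
code: every toy_* type and function is made up for illustration, and a plain
counter stands in for the real reserved-region bookkeeping.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model; none of these are the real iommufd structures. */
enum toy_hwpt_type { TOY_HWPT_PAGING, TOY_HWPT_NESTED };

struct toy_hwpt {
	enum toy_hwpt_type type;
	struct toy_hwpt *parent;	/* S2 parent, set only for a nested hwpt */
	int resv_users;			/* devices with reservations enforced here */
};

/* Return the paging (S2) table that should carry the reservations. */
static struct toy_hwpt *toy_hwpt_resv_target(struct toy_hwpt *hwpt)
{
	if (hwpt->type == TOY_HWPT_NESTED) {
		/* A nested hwpt is assumed to always sit on a paging parent. */
		assert(hwpt->parent && hwpt->parent->type == TOY_HWPT_PAGING);
		return hwpt->parent;
	}
	return hwpt;
}

/* Attach-time hook: enforce on the S2 regardless of the attach path. */
static void toy_hwpt_enforce_rr(struct toy_hwpt *hwpt)
{
	toy_hwpt_resv_target(hwpt)->resv_users++;
}

/* Detach-time hook: drop the reservation from the same S2. */
static void toy_hwpt_remove_rr(struct toy_hwpt *hwpt)
{
	toy_hwpt_resv_target(hwpt)->resv_users--;
}
```

With this shape the caller no longer needs the hwpt_is_paging() check around
the reservation step; the nested case just redirects to its parent.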

> The only downside is unnecessarily reserved regions of dev1
> (attached to hwpt_nested) added to S2 which might be directly
> attached only by dev2 so the available ranges for dev2 are
> unnecessarily shrunk.
>
> but I'm not sure that would be a real problem in practice, given
> 1) there is no usage using up closely the entire IOVA space yet,
> 2) guest may change the viommu mode to switch between nested
> and paging then VMM has to take all devices' reserved regions
> into consideration anyway, when composing the GPA space.

That sounds reasonable to me.
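
To make the downside concrete, here is a toy model of an S2 reserved list,
again userspace only, with hypothetical toy_* names and example addresses
chosen purely for illustration. Once dev1's region is reserved on the shared
S2, any range overlapping it stops being usable for dev2 as well, even if
dev2 never needed that reservation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_RESV 8

/* Toy model of an S2 IOAS reserved list; not the real io_pagetable code. */
struct toy_resv {
	uint64_t start;
	uint64_t last;	/* inclusive end */
};

struct toy_ioas {
	struct toy_resv resv[TOY_MAX_RESV];
	size_t nr_resv;
};

/* Record a reserved range; returns false if the toy table is full. */
static bool toy_ioas_reserve(struct toy_ioas *ioas, uint64_t start,
			     uint64_t last)
{
	if (ioas->nr_resv == TOY_MAX_RESV)
		return false;
	ioas->resv[ioas->nr_resv].start = start;
	ioas->resv[ioas->nr_resv].last = last;
	ioas->nr_resv++;
	return true;
}

/* A range is usable only if it overlaps no reserved range. */
static bool toy_ioas_range_usable(const struct toy_ioas *ioas, uint64_t start,
				  uint64_t last)
{
	for (size_t i = 0; i < ioas->nr_resv; i++)
		if (start <= ioas->resv[i].last && ioas->resv[i].start <= last)
			return false;
	return true;
}
```

The shrink is limited to the reserved windows themselves, which fits the
point above that nothing today comes close to using the whole IOVA space.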

> With that I think continuing this per-device reservation scheme is
> easier than adding specific reservation for SW_MSI at hwpt creation
> time and then further requiring check at attach time to verify
> the attached device is allocated with the same address as the one
> during allocation.

Jason, do you agree?

Thanks
Nic