RE: [PATCH v4 08/17] iommufd: Always setup MSI and anforce cc on kernel-managed domains

From: Tian, Kevin
Date: Wed Oct 18 2023 - 21:56:12 EST


> From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Sent: Thursday, October 19, 2023 12:51 AM
>
> On Tue, Oct 17, 2023 at 12:58:39PM -0700, Nicolin Chen wrote:
> > On Tue, Oct 17, 2023 at 12:53:01PM -0300, Jason Gunthorpe wrote:
> > > On Tue, Oct 17, 2023 at 08:52:49AM +0000, Tian, Kevin wrote:
> > > > I prefer to removing enforce_cc in attach fn completely then no parent
> > > > trick in this patch. Just keep it in hwpt_alloc and leave to iommu driver
> to
> > > > figure out the attaching compatibility:
> > >
> > > You are basically saying to set the cc mode during creation because we
> > > know the idev at that moment and can tell if it should be on/off?
> > >
> > > It seems reasonable, but I can't remember why it is in the attach path
> > > at the moment.
> >
> > This was the commit adding it in the alloc path:
> > https://lore.kernel.org/linux-iommu/8e897628-61fa-b3fb-b609-
> 44eeda11b45e@xxxxxxx/
> >
> > The older code was doing a hwpt "upgrade" from !cc to cc:
> > - /*
> > - * Try to upgrade the domain we have, it is an iommu driver bug to
> > - * report IOMMU_CAP_ENFORCE_CACHE_COHERENCY but fail
> > - * enforce_cache_coherency when there are no devices attached to
> the
> > - * domain.
> > - */
> > - if (idev->enforce_cache_coherency && !hwpt-
> >enforce_cache_coherency) {
> > - if (hwpt->domain->ops->enforce_cache_coherency)
> > - hwpt->enforce_cache_coherency =
> > - hwpt->domain->ops->enforce_cache_coherency(
> > - hwpt->domain);
> >
> > If we remove the enforce_cc call in the attach path and let the
> > driver decide whether to enforce or reject in attach_dev calls,
> > there seems to be no point in tracking an enforce_cache_coherency
> > flag in the IOMMUFD pathway but only for the VFIO_DMA_CC_IOMMU
> > extension check in the vfio-compat pathway?
>
> I think the purpose of this code is to try to minimize the number of
> domains.
>
> Otherwise we have a problem where the order devices are attached to
> the domain decides how many domains you get. ie the first device
> attached does not want CC (but is compatible with it) so we create a
> non-CC domain

in autodetect model this won't happen. If the first device is capable
of enforce_cc then the domain will be created with enforce_cc.

there is no "does not want CC" input in autodetect.

>
> Then later we attach a device that does want CC and now we are forced
> to create a second iommu domain when upgrading the first domain would
> have been fine.

then in this case the 2nd device will reuse the domain.

>
> Hotplug is another scenario (though Intel driver does not support it,
> and it looks broken)

Can you elaborate how hotplug is broken? If device is hotplugged and
failed to attach to an existing domain, then a new one will be created
for it.

there is indeed a broken case in KVM which cannot handle dynamic
change of coherency due to hotplug. But that one is orthogonal to
what we discussed here about how to decide cc in iommufd.

>
> Really, I hate this CC mechanism. It is only for Intel, can we just

Intel and AMD.

> punt it to userspace and have an intel 'want cc flag' for the entire
> nesting path and forget about it?
>

I dislike it too. But still not get your point why adding such a flag
can really simplify things. As explained before the only difference
between autodetect and having a user flag just affects the decision
of cc when creating a hwpt. whether we should upgrade in the
attach path is an orthogonal open which imho is unnecessary and
Nicoline's patches to remove that check then also remove this
patch makes lot of sense to me.

Then the necessity of introducing such a flag (plus adding a new
interface to report cc to user space) is not obvious.