Re: [RESEND PATCH v8 00/11] Fix BUG_ON in vfio_iommu_group_notifier()
From: Jason Gunthorpe
Date: Tue May 03 2022 - 11:23:48 EST
On Tue, May 03, 2022 at 02:04:37PM +0100, Robin Murphy wrote:
> > I'm guessing SMMU3 needs to call it's arm_smmu_detach_dev(master) from
> > the detach_dev op and null it's cached copy of the domain, but I don't
> > know this driver.. Robin?
>
> The original intent was that .detach_dev is deprecated in favour of default
> domains, and when the latter are in use, a device is always attached
> *somewhere* once probed (i.e. group->domain is never NULL). At face value,
> the neatest fix IMO would probably be for SMMUv3's .domain_free to handle
> smmu_domain->devices being non-empty and detach them at that point. However
> that wouldn't be viable for virtio-iommu or anyone else keeping an internal
> one-way association of devices to their current domains.
Oh wow that is not obvious
Actually, I think it is much worse than this because
iommu_group_claim_dma_owner() does a __iommu_detach_group() with the
expecation that this would actually result in DMA being blocked,
immediately. The idea that __iomuu_detatch_group() is a NOP is kind of
scary.
Leaving the group attached to the kernel DMA domain will allow
userspace to DMA to all kernel memory :\
So one approach could be to block use of iommu_group_claim_dma_owner()
if no detatch_dev op is present and then go through and put them back
or do something else. This could be short-term OK if we add an op to
SMMUv3, but long term everything would have to be fixed
Or we can allocate a dummy empty/blocked domain during
iommu_group_claim_dma_owner() and attach it whenever.
The really ugly trick is that detatch cannot fail, so attach to this
blocking domain must also not fail - IMHO this is a very complicated
API to expect for the driver to implement correctly... I see there is
already a WARN_ON that attaching to the default domain cannot
fail. Maybe this warrants an actual no-fail attach op so the driver
can be more aware of this..
And some of these internal APIs could stand some adjusting if we
really never want a true "detatch" it is always some kind of
replace/swap type operation, either to the default domain or to the
blocking domain.
> We *could* stay true to the original paradigm by introducing some real usage
> of IOMMU_DOMAIN_BLOCKED, such that we could keep one or more of those around
> to actively attach to instead of having groups in this unattached limbo
> state, but that's a bigger job involving adding support to drivers as well;
> too much for a quick fix now...
I suspect for the short term we can get by with an empty mapping
domain - using DOMAIN_BLOCKED is a bit of a refinement.
Thanks,
Jason