Re: [PATCH 1/1] iommu/vt-d: Fix missed device TLB cache tag

From: Baolu Lu
Date: Wed Jun 19 2024 - 23:16:20 EST


On 6/20/24 11:04 AM, Tian, Kevin wrote:
From: Baolu Lu<baolu.lu@xxxxxxxxxxxxxxx>
Sent: Thursday, June 20, 2024 8:50 AM

On 6/20/24 12:46 AM, Jason Gunthorpe wrote:
On Wed, Jun 19, 2024 at 09:53:45AM +0800, Lu Baolu wrote:
When a domain is attached to a device, the required cache tags are
assigned to the domain so that the related caches could be flushed
whenever it is needed. The device TLB cache tag is created selectively
by checking the ats_enabled field of the device's iommu data. This
creates an ordered dependency between attach and ATS enabling paths.

The device TLB cache tag will not be created if device's ATS is enabled
after the domain attachment. This causes some devices, for example
intel_vpu, to malfunction.
What? How is this even possible?

ATS is controlled exclusively by the iommu driver, how can it be
changed without the driver knowing??
Yes. ATS is currently controlled exclusively by the iommu driver. The
intel iommu driver enables PCI/ATS on the probe path after the default
domain is attached. That means when the default domain is attached to
the device, the ats_supported is set, but ats_enabled is cleared. So the
cache tag for the device TLB won't be created.
I don't quite get why this is specific to the probe path and the default
domain.

The issue is with the domain attaching device path, not specific to the
probe or default domain.


dmar_domain_attach_device()
{
        cache_tag_assign_domain();
        // setup pasid entry for pt/1st/2nd
        iommu_enable_pci_caps();
}

It seems that, for all domain attaches, the above is coded in the wrong
order, as ATS is enabled after the cache tag is assigned.

Yes, exactly. But simply changing the order isn't future-proof,
considering ATS control will eventually be moved out of iommu drivers.
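
To make the ordering dependency concrete, here is a minimal user-space
sketch. It is not the driver code: every toy_* name is hypothetical, and
the "by support" variant is only one assumed way to make tag assignment
independent of when ATS gets enabled, not necessarily what the patch does.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the per-device iommu data. */
struct toy_dev {
        bool ats_supported;     /* device is capable of ATS */
        bool ats_enabled;       /* ATS has actually been turned on */
        bool has_devtlb_tag;    /* a device TLB cache tag was assigned */
};

/* Behaviour described in the thread: tag only if ATS is already enabled. */
static void toy_assign_tags_checked(struct toy_dev *d)
{
        d->has_devtlb_tag = d->ats_enabled;
}

/* Assumed alternative: key the tag on capability, not on current state. */
static void toy_assign_tags_by_support(struct toy_dev *d)
{
        d->has_devtlb_tag = d->ats_supported;
}

/* Stand-in for enabling PCI/ATS on the device. */
static void toy_enable_ats(struct toy_dev *d)
{
        if (d->ats_supported)
                d->ats_enabled = true;
}

/* A flush reaches the device TLB only if a tag exists and ATS is on. */
static bool toy_flush_hits_devtlb(const struct toy_dev *d)
{
        return d->has_devtlb_tag && d->ats_enabled;
}

/* Attach in the order shown above: assign tags first, enable ATS second. */
static bool toy_attach_then_flush(void (*assign)(struct toy_dev *))
{
        struct toy_dev d = { .ats_supported = true };

        assign(&d);
        toy_enable_ats(&d);
        return toy_flush_hits_devtlb(&d);
}
```

With the "checked" variant, toy_attach_then_flush() reports that the
device TLB is never flushed, because the tag was skipped before ATS was
turned on. With the "by support" variant the same call order works, and
the ats_enabled test moves to flush time.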

Why is it considered to affect only some devices, e.g. intel_vpu?

This bug was discovered during testing of the intel_vpu device, but it
also affects devices other than intel_vpu. The commit message is a bit
confusing. :-)

Best regards,
baolu