Re: [PATCH 2/3] iommu/vt-d: Clear Present bit before tearing down PASID entry

From: Baolu Lu

Date: Fri Jan 16 2026 - 01:06:30 EST


On 1/16/26 05:35, Dmytro Maluka wrote:
On Thu, Jan 15, 2026 at 10:45:12AM +0800, Baolu Lu wrote:
On 1/14/26 19:12, Dmytro Maluka wrote:
On Wed, Jan 14, 2026 at 01:38:13PM +0800, Baolu Lu wrote:
On 1/14/26 03:34, Dmytro Maluka wrote:
On Tue, Jan 13, 2026 at 11:00:47AM +0800, Lu Baolu wrote:
+ intel_pasid_clear_entry(iommu, dev, pasid, fault_ignore);
Is it safe to do this with iommu->lock already unlocked?

Yes, it is. The PASID entry lifecycle is serialized by the
iommu_group->mutex in the iommu core, which ensures that no other thread
can attempt to allocate or set up this same PASID until
intel_pasid_tear_down_entry() has returned.

The iommu->lock is held during the initial transition (P->0) to ensure
atomicity against other hardware-table walkers, but once the P bit is
cleared and the caches are flushed, the final zeroing of the 'dead'
entry does not strictly require the spinlock because the PASID remains
reserved in software until the function completes.

Ok. Just to understand: "other hardware-table walkers" means some
software walkers, not hardware ones? Which software walkers are those?
(I can't imagine how holding a spinlock could prevent the hardware from
walking those tables. :))

You are right. A spinlock doesn't stop the hardware. The spinlock
serializes software threads to ensure the hardware walker always sees a
consistent entry.

When a PASID entry is active (P=1), other kernel paths might modify
the control bits in-place. For example:

void intel_pasid_setup_page_snoop_control(struct intel_iommu *iommu,
					  struct device *dev, u32 pasid)
{
	struct pasid_entry *pte;
	u16 did;

	spin_lock(&iommu->lock);
	pte = intel_pasid_get_entry(dev, pasid);
	if (WARN_ON(!pte || !pasid_pte_is_present(pte))) {
		spin_unlock(&iommu->lock);
		return;
	}

	pasid_set_pgsnp(pte);
	did = pasid_get_domain_id(pte);
	spin_unlock(&iommu->lock);

	intel_pasid_flush_present(iommu, dev, pasid, did, pte);
}

In this case, the iommu->lock ensures that if two threads try to modify
the same active entry, they don't interfere with each other and leave
the entry in a 'torn' state for the IOMMU hardware to read.

In intel_pasid_tear_down_entry(), once the PASID entry is deactivated
(setting P=0 and flushing caches), the entry is owned exclusively by
the teardown thread until it is re-configured. That's why the final
zeroing doesn't need the spinlock.

I see. Am I correct that those other code paths (modifying an entry
in-place) are not supposed to do that concurrently with
intel_pasid_tear_down_entry(), i.e. they should only do that while it is
guaranteed that the entry remains present? Otherwise there is a bug
(hence, for example, the WARN_ON in
intel_pasid_setup_page_snoop_control())?

The iommu driver assumes that higher-level software ensures this.

So, holding iommu->lock during entry teardown is not strictly necessary
(we could even unlock it before setting P=0), i.e. holding the lock
until the entry is deactivated is basically just a safety measure
against possibly buggy code?

There are other paths that may run concurrently, such as the debugfs
path (dumping the PASID table through debugfs). Therefore, keeping
iommu->lock in the driver is neither redundant nor buggy.

Thanks,
baolu