Re: [PATCH v2 1/6] iommu/vt-d: Setup scalable mode context entry in probe path

From: Baolu Lu
Date: Sat Dec 09 2023 - 02:58:21 EST


On 12/8/23 4:50 PM, Tian, Kevin wrote:
From: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
Sent: Tuesday, December 5, 2023 9:22 AM

@@ -304,6 +304,11 @@ int intel_pasid_setup_first_level(struct intel_iommu *iommu,
return -EINVAL;
}

+ if (intel_pasid_setup_sm_context(dev, true)) {
+ dev_err(dev, "Context entry is not configured\n");
+ return -ENODEV;
+ }
+
spin_lock(&iommu->lock);
pte = intel_pasid_get_entry(dev, pasid);
if (!pte) {
@@ -384,6 +389,11 @@ int intel_pasid_setup_second_level(struct intel_iommu *iommu,
return -EINVAL;
}

+ if (intel_pasid_setup_sm_context(dev, true)) {
+ dev_err(dev, "Context entry is not configured\n");
+ return -ENODEV;
+ }
+
pgd = domain->pgd;
agaw = iommu_skip_agaw(domain, iommu, &pgd);
if (agaw < 0) {
@@ -505,6 +515,11 @@ int intel_pasid_setup_pass_through(struct intel_iommu *iommu,
u16 did = FLPT_DEFAULT_DID;
struct pasid_entry *pte;

+ if (intel_pasid_setup_sm_context(dev, true)) {
+ dev_err(dev, "Context entry is not configured\n");
+ return -ENODEV;
+ }
+
spin_lock(&iommu->lock);
pte = intel_pasid_get_entry(dev, pasid);
if (!pte) {

Instead of replicating the invocation in all three stubs, it's simpler to
do it once in dmar_domain_attach_device() for all of them.

It's not good to repeat the code. Perhaps we can add this check to
intel_pasid_get_entry()? The rule is that you can't get the pasid entry
if the context is copied.
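
A rough userspace sketch of that rule, not driver code. Every name here
(model_dev, model_pasid_get_entry, MODEL_MAX_PASID) is hypothetical and only
stands in for the real structures; the point is that the lookup itself
refuses to hand out an entry while the context is still a copied one, so the
callers need no check of their own.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_PASID 8	/* assumption: toy table size */

/* Hypothetical stand-in for per-device state; not the kernel's layout. */
struct model_dev {
	bool context_copied;			/* entry inherited from the previous kernel */
	int pasid_table[MODEL_MAX_PASID];	/* toy PASID entries */
};

/*
 * Return the PASID entry, or NULL if the context entry is still a
 * copied one or the PASID is out of range.
 */
static int *model_pasid_get_entry(struct model_dev *dev, unsigned int pasid)
{
	if (dev->context_copied)
		return NULL;
	if (pasid >= MODEL_MAX_PASID)
		return NULL;
	return &dev->pasid_table[pasid];
}
```

With this shape, the existing "if (!pte)" error paths in the setup helpers
would cover the copied-context case as well.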

Then put the deferred check outside of intel_pasid_setup_sm_context()
instead of using a Boolean flag.

Okay, that's more readable.
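
A minimal sketch of what moving the check to the callers could look like,
modeled in plain C. The names (sketch_dev, deferred_attach, sketch_probe,
sketch_attach) and the exact meaning of "deferred" are assumptions made for
illustration, not what the patch implements.

```c
#include <stdbool.h>

/* Hypothetical device state for illustration only. */
struct sketch_dev {
	bool deferred_attach;	/* assumed: attach was postponed past probe */
	int setup_calls;	/* counts how often the context was set up */
};

/* Does the setup unconditionally; takes no Boolean mode flag. */
static int sketch_setup_sm_context(struct sketch_dev *dev)
{
	dev->setup_calls++;
	return 0;
}

/* Probe path: the caller skips devices whose attach is deferred. */
static int sketch_probe(struct sketch_dev *dev)
{
	if (dev->deferred_attach)
		return 0;
	return sketch_setup_sm_context(dev);
}

/* Attach path: the caller handles only the deferred devices. */
static int sketch_attach(struct sketch_dev *dev)
{
	if (!dev->deferred_attach)
		return 0;
	dev->deferred_attach = false;
	return sketch_setup_sm_context(dev);
}
```

Each call site then states its own condition, and the setup function has a
single behavior.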

@@ -623,6 +638,11 @@ int intel_pasid_setup_nested(struct intel_iommu *iommu, struct device *dev,
return -EINVAL;
}

+ if (intel_pasid_setup_sm_context(dev, true)) {
+ dev_err_ratelimited(dev, "Context entry is not configured\n");
+ return -ENODEV;
+ }
+

Do we support nested in kdump?

No.


+
+ /*
+ * Cache invalidation for changes to a scalable-mode context table
+ * entry.
+ *
+ * Section 6.5.3.3 of the VT-d spec:
+ * - Device-selective context-cache invalidation;
+ * - Domain-selective PASID-cache invalidation to affected domains
+ * (can be skipped if all PASID entries were not-present);
+ * - Domain-selective IOTLB invalidation to affected domains;
+ * - Global Device-TLB invalidation to affected functions.
+ *
+ * For kdump cases, old valid entries may be cached due to the
+ * in-flight DMA and copied pgtable, but there is no unmapping
+ * behaviour for them, thus we need explicit cache flushes for all
+ * affected domain IDs and PASIDs used in the copied PASID table.
+ * Given that we have no idea about which domain IDs and PASIDs were
+ * used in the copied tables, upgrade them to global PASID and IOTLB
+ * cache invalidation.
+ *
+ * For kdump case, at this point, the device is supposed to finish
+ * reset at its driver probe stage, so no in-flight DMA will exist,
+ * and we don't need to worry anymore hereafter.
+ */
+ if (context_copied(iommu, bus, devfn)) {
+ context_clear_entry(context);
+ clear_context_copied(iommu, bus, devfn);
+ iommu->flush.flush_context(iommu, 0,
+ (((u16)bus) << 8) | devfn,
+ DMA_CCMD_MASK_NOBIT,
+ DMA_CCMD_DEVICE_INVL);
+ qi_flush_pasid_cache(iommu, 0, QI_PC_GLOBAL, 0);
+ iommu->flush.flush_iotlb(iommu, 0, 0, 0, DMA_TLB_GLOBAL_FLUSH);
+ devtlb_invalidation_with_pasid(iommu, dev, IOMMU_NO_PASID);
+ }

I don't see this logic in the existing code. If it's a bug fix then
please send it separately first.

This code originates from domain_context_mapping_one(). It's not a bug
fix.

+
+ context_entry_set_pasid_table(context, dev);

And here is an additional change to the context entry. Why is the
context cache invalidated at the start?

The previous context entry may be copied from a previous kernel.
Therefore, we need to tear down the entry and flush the caches before
reusing it.
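
To make the ordering concrete, here is a small userspace model of the
sequence: clear the copied entry, flush in the order listed in the comment
quoted above (context cache, PASID cache, IOTLB, Device-TLB), and only then
program the new value. The types and helpers (sketch_ctx,
sketch_record_flush, sketch_setup_context) are hypothetical and simply
record what real hardware flushes would do.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flush kinds, in the order the quoted comment lists them. */
enum sketch_flush_kind {
	FLUSH_CONTEXT,
	FLUSH_PASID_CACHE,
	FLUSH_IOTLB,
	FLUSH_DEVTLB,
};

/* Hypothetical model of one context entry plus a flush log. */
struct sketch_ctx {
	bool copied;			/* inherited from the previous kernel */
	uint64_t lo, hi;		/* toy entry contents */
	enum sketch_flush_kind log[4];	/* flushes issued, in order */
	int nr_flushes;
};

/* Record a flush instead of touching hardware. */
static void sketch_record_flush(struct sketch_ctx *c, enum sketch_flush_kind k)
{
	if (c->nr_flushes < 4)
		c->log[c->nr_flushes++] = k;
}

/* Tear down a copied entry and flush the caches before reusing it. */
static void sketch_setup_context(struct sketch_ctx *c, uint64_t new_lo)
{
	if (c->copied) {
		c->lo = 0;
		c->hi = 0;
		c->copied = false;
		sketch_record_flush(c, FLUSH_CONTEXT);
		sketch_record_flush(c, FLUSH_PASID_CACHE);
		sketch_record_flush(c, FLUSH_IOTLB);
		sketch_record_flush(c, FLUSH_DEVTLB);
	}
	c->lo = new_lo;	/* program the fresh entry last */
}
```

An entry that was never copied skips the teardown and is programmed
directly.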


+
+static int pci_pasid_table_setup(struct pci_dev *pdev, u16 alias, void *data)
+{
+ struct device *dev = data;
+
+ if (dev != &pdev->dev)
+ return 0;

What is this for? The existing domain_context_mapping_cb() doesn't have
this check, which implies a behavior change.

Emm, I should remove this line and keep it consistent with the existing
code.

Best regards,
baolu