RE: [PATCH v3 04/10] iommu/vt-d: Add helper to setup pasid nested translation

From: Tian, Kevin
Date: Wed May 24 2023 - 03:17:54 EST


> From: Yi Liu <yi.l.liu@xxxxxxxxx>
> Sent: Thursday, May 11, 2023 10:51 PM
>
> +
> +/**
> + * intel_pasid_setup_nested() - Set up PASID entry for nested translation.
> + * This could be used for guest shared virtual address. In this case, the
> + * first level page tables are used for GVA-GPA translation in the guest,
> + * second level page tables are used for GPA-HPA translation.

it's not just for guest SVA. Actually in this series it's RID_PASID nested
translation.

> + *
> + * @iommu: IOMMU which the device belong to
> + * @dev: Device to be set up for translation
> + * @pasid: PASID to be programmed in the device PASID table
> + * @domain: User domain nested on a s2 domain

"User stage-1 domain"

> + */
> +int intel_pasid_setup_nested(struct intel_iommu *iommu, struct device
> *dev,
> + u32 pasid, struct dmar_domain *domain)
> +{
> + struct iommu_hwpt_intel_vtd *s1_cfg = &domain->s1_cfg;
> + pgd_t *s1_gpgd = (pgd_t *)(uintptr_t)domain->s1_pgtbl;
> + struct dmar_domain *s2_domain = domain->s2_domain;
> + u16 did = domain_id_iommu(domain, iommu);
> + struct dma_pte *pgd = s2_domain->pgd;
> + struct pasid_entry *pte;
> + int agaw;
> +
> + if (!ecap_nest(iommu->ecap)) {
> + pr_err_ratelimited("%s: No nested translation support\n",
> + iommu->name);
> + return -ENODEV;
> + }
> +
> + /*
> + * Sanity checking performed by caller to make sure address width

"by caller"? it's checked in this function.

> + * matching in two dimensions: CPU vs. IOMMU, guest vs. host.
> + */
> + switch (s1_cfg->addr_width) {
> + case ADDR_WIDTH_4LEVEL:
> + break;
> +#ifdef CONFIG_X86
> + case ADDR_WIDTH_5LEVEL:
> + if (!cpu_feature_enabled(X86_FEATURE_LA57) ||
> + !cap_fl5lp_support(iommu->cap)) {
> + dev_err_ratelimited(dev,
> + "5-level paging not supported\n");
> + return -EINVAL;
> + }
> + break;
> +#endif
> + default:
> + dev_err_ratelimited(dev, "Invalid guest address width %d\n",
> + s1_cfg->addr_width);
> + return -EINVAL;
> + }
> +
> + if ((s1_cfg->flags & IOMMU_VTD_PGTBL_SRE) && !ecap_srs(iommu-
> >ecap)) {
> + pr_err_ratelimited("No supervisor request support on %s\n",
> + iommu->name);
> + return -EINVAL;
> + }
> +
> + if ((s1_cfg->flags & IOMMU_VTD_PGTBL_EAFE)
> && !ecap_eafs(iommu->ecap)) {
> + pr_err_ratelimited("No extended access flag support
> on %s\n",
> + iommu->name);
> + return -EINVAL;
> + }
> +
> + /*
> + * Memory type is only applicable to devices inside processor
> coherent
> + * domain. Will add MTS support once coherent devices are available.
> + */
> + if (s1_cfg->flags & IOMMU_VTD_PGTBL_MTS_MASK) {
> + pr_warn_ratelimited("No memory type support %s\n",
> + iommu->name);
> + return -EINVAL;
> + }

If it's unsupported why exposing them in the uAPI at this point?

> +
> + agaw = iommu_skip_agaw(s2_domain, iommu, &pgd);
> + if (agaw < 0) {
> + dev_err_ratelimited(dev, "Invalid domain page table\n");
> + return -EINVAL;
> + }

this looks problematic.

static inline int iommu_skip_agaw(struct dmar_domain *domain,
struct intel_iommu *iommu,
struct dma_pte **pgd)
{
int agaw;

for (agaw = domain->agaw; agaw > iommu->agaw; agaw--) {
*pgd = phys_to_virt(dma_pte_addr(*pgd));
if (!dma_pte_present(*pgd))
return -EINVAL;
}

return agaw;
}

why is it safe to change pgd level of s2 domain when it's used as
the parent? this s2 pgtbl might be used by other devices behind
other iommus which already maps GPAs in a level which this
iommu doesn't support...

shouldn't we simply fail it as another incompatible condition?

> +
> + /* First level PGD (in GPA) must be supported by the second level. */
> + if ((uintptr_t)s1_gpgd > (1ULL << s2_domain->gaw)) {
> + dev_err_ratelimited(dev,
> + "Guest PGD %lx not supported,
> max %llx\n",
> + (uintptr_t)s1_gpgd, s2_domain-
> >max_addr);
> + return -EINVAL;
> + }

I'm not sure how useful this check is. Even if the pgd is sane the
lower level PTEs could include unsupported GPA's. If a guest really
doesn't want to follow the GPA restriction which vIOMMU reports,
it can easily cause IOMMU fault in many ways.

Then why treating pgd specially?