Re: [PATCH v2 7/7] iommu/vt-d: Consolidate domain_init() to avoid duplication

From: Alex Williamson
Date: Thu Jul 18 2019 - 19:16:19 EST


On Wed, 12 Jun 2019 08:28:51 +0800
Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx> wrote:

> The domain_init() and md_domain_init() do almost the same job.
> Consolidate them to avoid duplication.
>
> Signed-off-by: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
> ---
> drivers/iommu/intel-iommu.c | 123 +++++++++++-------------------------
> 1 file changed, 36 insertions(+), 87 deletions(-)
>
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 5215dcd535a1..b8c6cf1d5f90 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -1825,63 +1825,6 @@ static inline int guestwidth_to_adjustwidth(int gaw)
> return agaw;
> }
>
> -static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu,
> - int guest_width)
> -{
> - int adjust_width, agaw;
> - unsigned long sagaw;
> - int err;
> -
> - init_iova_domain(&domain->iovad, VTD_PAGE_SIZE, IOVA_START_PFN);
> -
> - err = init_iova_flush_queue(&domain->iovad,
> - iommu_flush_iova, iova_entry_free);
> - if (err)
> - return err;
> -
> - domain_reserve_special_ranges(domain);
> -
> - /* calculate AGAW */
> - if (guest_width > cap_mgaw(iommu->cap))
> - guest_width = cap_mgaw(iommu->cap);
> - domain->gaw = guest_width;
> - adjust_width = guestwidth_to_adjustwidth(guest_width);
> - agaw = width_to_agaw(adjust_width);
> - sagaw = cap_sagaw(iommu->cap);
> - if (!test_bit(agaw, &sagaw)) {
> - /* hardware doesn't support it, choose a bigger one */
> - pr_debug("Hardware doesn't support agaw %d\n", agaw);
> - agaw = find_next_bit(&sagaw, 5, agaw);
> - if (agaw >= 5)
> - return -ENODEV;
> - }
> - domain->agaw = agaw;
> -
> - if (ecap_coherent(iommu->ecap))
> - domain->iommu_coherency = 1;
> - else
> - domain->iommu_coherency = 0;
> -
> - if (ecap_sc_support(iommu->ecap))
> - domain->iommu_snooping = 1;
> - else
> - domain->iommu_snooping = 0;
> -
> - if (intel_iommu_superpage)
> - domain->iommu_superpage = fls(cap_super_page_val(iommu->cap));
> - else
> - domain->iommu_superpage = 0;
> -
> - domain->nid = iommu->node;
> -
> - /* always allocate the top pgd */
> - domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
> - if (!domain->pgd)
> - return -ENOMEM;
> - __iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE);
> - return 0;
> -}
> -
> static void domain_exit(struct dmar_domain *domain)
> {
> struct page *freelist;
> @@ -2563,6 +2506,31 @@ static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
> return 0;
> }
>
> +static int domain_init(struct dmar_domain *domain, int guest_width)
> +{
> + int adjust_width;
> +
> + init_iova_domain(&domain->iovad, VTD_PAGE_SIZE, IOVA_START_PFN);
> + domain_reserve_special_ranges(domain);
> +
> + /* calculate AGAW */
> + domain->gaw = guest_width;
> + adjust_width = guestwidth_to_adjustwidth(guest_width);
> + domain->agaw = width_to_agaw(adjust_width);


How do we justify that domain->agaw is nothing like it was previously
here? I spent some more time working on the failure to boot that I
thought was caused by 4ec066c7b147, but there are so many breakages and
fixes in Joerg's x86/vt-d branch that I think my bisect zero'd in on
the wrong one. Instead I cherry-picked every commit from Joerg's tree
and matched Fixes patches to their original commit, which led me to
this patch, mainline commit 123b2ffc376e. The issue I'm seeing is that
we call domain_context_mapping_one() and we are in this section:

struct dma_pte *pgd = domain->pgd;
int agaw;

context_set_domain_id(context, did);

if (translation != CONTEXT_TT_PASS_THROUGH) {
/*
* Skip top levels of page tables for iommu which has
* less agaw than default. Unnecessary for PT mode.
*/
for (agaw = domain->agaw; agaw > iommu->agaw; agaw--) {
ret = -ENOMEM;
pgd = phys_to_virt(dma_pte_addr(pgd));
if (!dma_pte_present(pgd))
goto out_unlock;
}

Prior to this commit, we had domain->agaw=1 and iommu->agaw=1, so we
don't enter the loop. With this commit, we have domain->agaw=3,
iommu->agaw=1 with pgd->val=0!

I don't really follow how the setting of these fields above is
equivalent to what they were previously or if they're supposed to be
updated lazily, but the current behavior is non-functional. Commit
123b2ffc376e can be reverted with only a bit of offset, which brings
Linus' tree back into working operation for me. Should we revert or is
there an obvious fix here? Thanks,

Alex