Re: [RFC PATCH] iommu/amd: fix a race in increase_address_space()

From: Qian Cai
Date: Thu Sep 05 2019 - 16:57:26 EST


On Thu, 2019-09-05 at 13:43 +0200, Joerg Roedel wrote:
> Hi Qian,
>
> On Wed, Sep 04, 2019 at 05:24:22PM -0400, Qian Cai wrote:
> > if (domain->mode == PAGE_MODE_6_LEVEL)
> > /* address space already 64 bit large */
> > return false;
> >
> > This gives a clue that there must be a race between multiple concurrent
> > threads in increase_address_space().
>
> Thanks for tracking this down, there is a race indeed.
>
> > + mutex_lock(&domain->api_lock);
> > *dma_addr = __map_single(dev, dma_dom, page_to_phys(page),
> > size, DMA_BIDIRECTIONAL, dma_mask);
> > + mutex_unlock(&domain->api_lock);
> >
> > if (*dma_addr == DMA_MAPPING_ERROR)
> > goto out_free;
> > @@ -2696,7 +2698,9 @@ static void free_coherent(struct device *dev, size_t size,
> >
> > dma_dom = to_dma_ops_domain(domain);
> >
> > + mutex_lock(&domain->api_lock);
> > __unmap_single(dma_dom, dma_addr, size, DMA_BIDIRECTIONAL);
> > + mutex_unlock(&domain->api_lock);
>
> But I think the right fix is to lock the operation in
> increase_address_space() directly, and not the calls around it, like in
> the diff below. It is untested, so can you please try it and report back
> if it fixes your issue?

Yes, it works great so far.

>
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index b607a92791d3..1ff705f16239 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -1424,18 +1424,21 @@ static void free_pagetable(struct protection_domain *domain)
> * another level increases the size of the address space by 9 bits to a size up
> * to 64 bits.
> */
> -static bool increase_address_space(struct protection_domain *domain,
> +static void increase_address_space(struct protection_domain *domain,
> gfp_t gfp)
> {
> + unsigned long flags;
> u64 *pte;
>
> - if (domain->mode == PAGE_MODE_6_LEVEL)
> + spin_lock_irqsave(&domain->lock, flags);
> +
> + if (WARN_ON_ONCE(domain->mode == PAGE_MODE_6_LEVEL))
> /* address space already 64 bit large */
> - return false;
> + goto out;
>
> pte = (void *)get_zeroed_page(gfp);
> if (!pte)
> - return false;
> + goto out;
>
> *pte = PM_LEVEL_PDE(domain->mode,
> iommu_virt_to_phys(domain->pt_root));
> @@ -1443,7 +1446,10 @@ static bool increase_address_space(struct protection_domain *domain,
> domain->mode += 1;
> domain->updated = true;
>
> - return true;
> +out:
> + spin_unlock_irqrestore(&domain->lock, flags);
> +
> + return;
> }
>
> static u64 *alloc_pte(struct protection_domain *domain,