Re: [PATCH v4 3/7] iommu/mediatek: Use gather to achieve the tlb range flush

From: Will Deacon
Date: Wed Oct 23 2019 - 12:55:52 EST


On Wed, Oct 16, 2019 at 11:33:08AM +0800, Yong Wu wrote:
> Use the iommu_gather mechanism to achieve the tlb range flush.
> Gather the iova range in the "tlb_add_page", then flush the merged iova
> range in iotlb_sync.
>
> Suggested-by: Tomasz Figa <tfiga@xxxxxxxxxxxx>
> Signed-off-by: Yong Wu <yong.wu@xxxxxxxxxxxx>
> ---
> drivers/iommu/mtk_iommu.c | 12 ++++++++----
> 1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index c2f6c78..81ac95f 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -245,11 +245,9 @@ static void mtk_iommu_tlb_flush_page_nosync(struct iommu_iotlb_gather *gather,
>  					    void *cookie)
>  {
>  	struct mtk_iommu_data *data = cookie;
> -	unsigned long flags;
> +	struct iommu_domain *domain = &data->m4u_dom->domain;
>
> -	spin_lock_irqsave(&data->tlb_lock, flags);
> -	mtk_iommu_tlb_add_flush_nosync(iova, granule, granule, true, cookie);
> -	spin_unlock_irqrestore(&data->tlb_lock, flags);
> +	iommu_iotlb_gather_add_page(domain, gather, iova, granule);

You need to be careful here, because iommu_iotlb_gather_add_page() can
call iommu_tlb_sync() in some situations and you don't hold the lock.
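
To illustrate the point above, here is a minimal user-space sketch of the
gather logic, assuming the helper syncs whenever the new page is disjoint
from the gathered range or has a different granule. It is a model, not the
kernel source: struct gather_model, model_sync() and the implicit_syncs
counter are hypothetical names used only for illustration, and "end" is
treated as exclusive to match the "gather->end - gather->start" length
calculation in the patch.

```c
#include <limits.h>
#include <stddef.h>

/* User-space model of the gather state; field names mirror the quoted patch. */
struct gather_model {
	unsigned long start;	/* ULONG_MAX while the gather is empty */
	unsigned long end;	/* exclusive end (assumption, see lead-in) */
	size_t pgsize;		/* 0 while the gather is empty */
	int implicit_syncs;	/* hypothetical counter, illustration only */
};

static inline void gather_init(struct gather_model *g)
{
	g->start = ULONG_MAX;
	g->end = 0;
	g->pgsize = 0;
	/* implicit_syncs is left alone so it keeps counting across resets */
}

/*
 * Stand-in for iommu_tlb_sync(): in the kernel this would reach the
 * driver's iotlb_sync callback and then reset the gather.
 */
static inline void model_sync(struct gather_model *g)
{
	g->implicit_syncs++;
	gather_init(g);
}

static inline void gather_add_page(struct gather_model *g,
				   unsigned long iova, size_t size)
{
	unsigned long start = iova, end = start + size;

	/*
	 * A disjoint page, or one with a different granule, cannot be
	 * merged into the current range, so the pending range is synced
	 * first. This is the call made from the "nosync" path.
	 */
	if (g->pgsize != size || end < g->start || start > g->end) {
		if (g->pgsize)
			model_sync(g);
		g->pgsize = size;
	}

	if (g->end < end)
		g->end = end;
	if (g->start > start)
		g->start = start;
}
```

In this model, adding 0x1000 and then 0x2000 with a 4K granule merges into
one range without a sync, whereas a following page at 0x10000 triggers
model_sync() from inside gather_add_page(). That second case is the one
where the driver's sync path runs without the caller having taken any lock.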

>  static const struct iommu_flush_ops mtk_iommu_flush_ops = {
> @@ -469,9 +467,15 @@ static void mtk_iommu_iotlb_sync(struct iommu_domain *domain,
>  				 struct iommu_iotlb_gather *gather)
>  {
>  	struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
> +	size_t length = gather->end - gather->start;
>  	unsigned long flags;
>
> +	if (gather->start == ULONG_MAX)
> +		return;
> +
>  	spin_lock_irqsave(&data->tlb_lock, flags);
> +	mtk_iommu_tlb_add_flush_nosync(gather->start, length, gather->pgsize,
> +				       false, data);
>  	mtk_iommu_tlb_sync(data);
>  	spin_unlock_irqrestore(&data->tlb_lock, flags);

Modulo my comment above, this fixes my previous comment. Given that mainline
is already broken, I guess the runtime bisectability isn't a problem.
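
For reference, the early return in the hunk above relies on
iommu_iotlb_gather_init() setting gather->start to ULONG_MAX, so an
untouched gather can be recognised and skipped. A small user-space sketch
of that check follows, assuming an exclusive "end" as the patch's length
calculation implies; struct gather_range and gather_to_flush() are
hypothetical names, not kernel API.

```c
#include <limits.h>
#include <stddef.h>

/* Model of the fields the sync path reads from the gather. */
struct gather_range {
	unsigned long start;	/* ULONG_MAX means nothing was gathered */
	unsigned long end;	/* exclusive end (assumption) */
	size_t pgsize;
};

/*
 * Returns 1 and fills *iova and *length when there is a range to flush,
 * or 0 when the gather is still in its initial, empty state.
 */
static inline int gather_to_flush(const struct gather_range *g,
				  unsigned long *iova, size_t *length)
{
	if (g->start == ULONG_MAX)
		return 0;

	*iova = g->start;
	*length = g->end - g->start;
	return 1;
}
```

With an empty gather (start == ULONG_MAX, end == 0) the function returns 0
and no flush is issued; with start == 0x1000 and end == 0x3000 it reports a
0x2000-byte range starting at 0x1000.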

Will