Re: [RFC PATCH] iommu/amd: fix a race in fetch_pte()

From: Joerg Roedel
Date: Wed Apr 08 2020 - 10:19:19 EST


Hi Qian,

On Tue, Apr 07, 2020 at 11:36:05AM -0400, Qian Cai wrote:
> After further testing, the change alone is insufficient. What I am chasing right
> now is the swap device will go offline after heavy memory pressure below. The
> symptom is similar to what we have in the commit,
>
> 754265bcab78 ("iommu/amd: Fix race in increase_address_space()")
>
> Apparently, it is not possible to take the domain->lock in fetch_pte() because it
> could sleep.

Thanks a lot for finding and tracking down another race in the AMD IOMMU
page-table code. The domain->lock is a spin-lock and taking it can't
sleep. But fetch_pte() is a fast-path and must not take any locks.

I think the best fix is to update the pt_root and mode of the domain
atomically by storing the mode in the lower 12 bits of pt_root. This way
they are stored together and can be read and written atomically.
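
Roughly something like the sketch below. This is userspace C11 to
illustrate the packing only, not the actual driver change and not tested
against it. The names struct domain_sketch, pt_root_pack(),
domain_set_pgtable() and domain_get_pgtable() are made up for
illustration, and it assumes the page-table root is 4K-aligned so that
its low 12 bits are free to carry the mode.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Low 12 bits of a 4K-aligned root address are always zero (assumption). */
#define PT_MODE_MASK 0xfffULL

/* Hypothetical stand-in for the domain; only the packed field is shown. */
struct domain_sketch {
	_Atomic uint64_t pt_root;	/* root address | paging mode */
};

/* Combine root and mode into one 64-bit value. */
static inline uint64_t pt_root_pack(uint64_t root, unsigned int mode)
{
	assert((root & PT_MODE_MASK) == 0);	/* root must be page-aligned */
	assert(mode <= PT_MODE_MASK);		/* mode must fit in 12 bits */
	return root | mode;
}

/* Writer side: publish root and mode with a single atomic store. */
static inline void domain_set_pgtable(struct domain_sketch *d,
				      uint64_t root, unsigned int mode)
{
	atomic_store_explicit(&d->pt_root, pt_root_pack(root, mode),
			      memory_order_release);
}

/*
 * Reader side (what a lock-free walker would do): one atomic load,
 * then split the snapshot, so root and mode always belong together.
 */
static inline void domain_get_pgtable(struct domain_sketch *d,
				      uint64_t *root, unsigned int *mode)
{
	uint64_t v = atomic_load_explicit(&d->pt_root, memory_order_acquire);

	*root = v & ~PT_MODE_MASK;
	*mode = (unsigned int)(v & PT_MODE_MASK);
}
```

The point of the single load is that a reader can never see a new root
combined with an old mode, or the other way around, without taking a
lock.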

Regards,

Joerg