Re: [PATCH 4/4] iommu/arm-smmu-v3: Remove cmpxchg() in arm_smmu_cmdq_issue_cmdlist()

From: Will Deacon
Date: Thu Jul 16 2020 - 06:20:44 EST


On Tue, Jun 23, 2020 at 01:28:40AM +0800, John Garry wrote:
> It has been shown that the cmpxchg() for finding space in the cmdq can
> be a bottleneck:
> - for more CPUs contending the cmdq, the cmpxchg() will fail more often
> - since the software-maintained cons pointer is updated on the same 64b
> memory region, the chance of cmpxchg() failure increases again
>
> The cmpxchg() is removed as part of 2 related changes:
>
> - Update prod and cmdq owner in a single atomic add operation. For this, we
> count the prod and owner in separate regions in prod memory.
>
> As with simple binary counting, once the prod+wrap fields overflow, they
> will zero. They should never overflow into "owner" region, and we zero
> the non-owner, prod region for each owner. This maintains the prod
> pointer.
>
> As for the "owner", we now count this value, instead of setting a flag.
> Similar to before, once the owner has finished gathering, it will clear
> a mask. As such, a CPU declares itself as the "owner" when it reads zero
> for this region. This zeroing will also clear possible overflow in
> wrap+prod region, above.
>
> The owner is now responsible for all cmdq locking to avoid possible
> deadlock. The owner will lock the cmdq for all non-owners it has gathered
> when they have space in the queue and have written their entries.
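
For readers following along, the single-add scheme described above can
be sketched in portable C11 roughly as below. This is only an
illustration of the idea, not the code from the patch: the field widths
and the names PROD_BITS, OWNER_ONE, claim_slots() and stop_gathering()
are hypothetical, and C11 atomics stand in for the kernel's atomic_t
helpers.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: low bits hold prod index + wrap, high bits count owners. */
#define PROD_BITS   16u
#define PROD_MASK   ((1u << PROD_BITS) - 1u)
#define OWNER_ONE   (1u << PROD_BITS)
#define OWNER_MASK  (~PROD_MASK)

struct claim {
	uint32_t prod;  /* first slot assigned to this CPU */
	bool owner;     /* true if the owner region read as zero */
};

/* Reserve n slots and bump the owner count in one atomic add. */
static struct claim claim_slots(_Atomic uint32_t *word, uint32_t n)
{
	uint32_t old = atomic_fetch_add(word, n + OWNER_ONE);
	struct claim c = {
		.prod = old & PROD_MASK,
		.owner = (old & OWNER_MASK) == 0,
	};
	return c;
}

/*
 * Owner stops gathering: clear the owner region. This also discards any
 * carry that leaked out of the prod+wrap field. Returns the prod value
 * at the end of the gathered batch.
 */
static uint32_t stop_gathering(_Atomic uint32_t *word)
{
	return atomic_fetch_and(word, PROD_MASK) & PROD_MASK;
}
```

With this layout the add can never fail, unlike a cmpxchg() loop, which
is the point of the change. The cost is that prod is handed out before
anyone knows whether the queue has room.
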
>
> - Check for space in the cmdq after the prod pointer has been assigned.
>
> We don't bother checking for space in the cmdq before assigning the prod
> pointer, as this would be racy.
>
> So since the prod pointer is updated unconditionally, it would be common
> for no space to be available in the cmdq when prod is assigned - that
> is, according to the software-maintained prod and cons pointers. So now
> it must be ensured that the entries are not yet written and not until
> there is space.
>
> How the prod pointer is maintained also leads to a strange condition
> where the prod pointer can wrap past the cons pointer. We can detect this
> condition, and report no space here. However, a prod pointer progressed
> twice past the cons pointer cannot be detected. But it can be ensured that
> this scenario does not occur, as we limit the number of commands any CPU
> can issue at any given time, such that we cannot progress the prod
> pointer further.
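
The detectable and undetectable cases can be illustrated with a small
model in which prod and cons each carry an index plus one wrap bit.
This is a sketch under assumptions, not the check from the patch: the
16-entry queue size and the names Q_SHIFT, q_used() and q_has_space()
are made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define Q_SHIFT     4u                    /* assumed: 16-entry queue */
#define Q_SIZE      (1u << Q_SHIFT)
#define Q_PTR_MASK  ((Q_SIZE << 1) - 1u)  /* index bits plus one wrap bit */

/* Distance from cons to prod, modulo twice the queue size. */
static uint32_t q_used(uint32_t prod, uint32_t cons)
{
	return (prod - cons) & Q_PTR_MASK;
}

/* prod is the first slot already assigned to the caller. */
static bool q_has_space(uint32_t prod, uint32_t cons, uint32_t n)
{
	uint32_t used = q_used(prod, cons);

	/* prod has run past cons by more than one queue's worth. */
	if (used > Q_SIZE)
		return false;

	return Q_SIZE - used >= n;
}
```

In this model q_has_space(20, 0, 1) is false, because prod is 4 entries
beyond a full queue and the overrun is visible. q_has_space(32, 0, 16)
is true, because a prod that has advanced by twice the queue size
aliases to an empty queue. That second case is the one that has to be
ruled out by bounding how many commands can be outstanding.
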
>
> Signed-off-by: John Garry <john.garry@xxxxxxxxxx>
> ---
> drivers/iommu/arm-smmu-v3.c | 101 ++++++++++++++++++++++--------------
> 1 file changed, 61 insertions(+), 40 deletions(-)

I must admit, you made me smile putting trivial@xxxxxxxxxx on cc for this ;)

Will