Re: [PATCH] iommu/arm-smmu-v3: Fix incorrect description of DMB instruction

From: Robin Murphy
Date: Mon Dec 12 2022 - 08:58:44 EST


On 2022-12-09 13:53, Harry Song wrote:
The current comment is a description of the DSB instruction:
previous commit [1].

In the ARM architecture manual, DSB and DMB instructions
are described as follows:

DMB: The DMB instruction does not ensure the completion of
any of the memory accesses for which it ensures relative order.i

DSB: A DSB instruction is a memory barrier that ensures that
memory accesses that occur before the DSB instruction have
completed before the completion of the DSB instruction.

So after dsb is replaced with dmb, the description here is not correct.
DMB instructions do not ensure that memory access has been completed,
but rather ensure the order of instructions.

Except a memory access and MMIO write are to different destinations, so it's likely that in practice the only way the CPU/interconnect can enforce that ordering and guarantee that the memory access will be observed by a peripheral as soon as the MMIO write completes is to wait for the memory access to complete before allowing the MMIO write to proceed beyond the point at which they diverge.

Furthermore, "update the cons pointer" is not a defined architectural term anyway, and certainly to me it rather implies the entire process of the SMMU receiving the value and updating its internal queue state, which necessarily implies completion of the MMIO write and thus completion of everything ordered before it. If the comment had been overly-specific and said "before executing the following store to the cons register", then it might be deserving of getting into this level of detail, but as it is, meh. Even then it's not like this is architecture code using an explicit dmb(); ultimately it's a Linux driver describing an overall behaviour it expects from a Linux barrier operation in terms of the Linux memory model.

Frankly the much bigger complaint I have with this comment is that although it does technically explain the purpose of having some barrier here, it does a pretty terrible job of clarifying *why* it's open-coded and can't be achieved with standard Linux memory model primitives, particularly since __iomb() is undocumented (other than "arm64-specific") and appears nowhere else in the entire kernel.

Thanks,
Robin.

a76a37777f2c ("iommu/arm-smmu-v3: Ensure queue is read after updating prod pointer") [1]

Signed-off-by: Harry Song <jundongsong1@xxxxxxxxx>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 6d5df91c5..fb229c0ab 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -139,8 +139,8 @@ static bool queue_consumed(struct arm_smmu_ll_queue *q, u32 prod)
static void queue_sync_cons_out(struct arm_smmu_queue *q)
{
/*
- * Ensure that all CPU accesses (reads and writes) to the queue
- * are complete before we update the cons pointer.
+ * Ensure the relative order of cpu accesses (reads and writes)
+ * to the queue before update the cons pointer.
*/
__iomb();
writel_relaxed(q->llq.cons, q->cons_reg);