RE: [PATCH] locking/memory-barriers.txt: Improve documentation for writel() usage

From: Parav Pandit
Date: Thu Sep 15 2022 - 12:35:29 EST


> From: Arnd Bergmann <arnd@xxxxxxxx>
> Sent: Thursday, September 15, 2022 11:16 AM
>
> On Thu, Sep 15, 2022, at 4:18 PM, Parav Pandit wrote:
> >> From: Arnd Bergmann <arnd@xxxxxxxx>
> >> Sent: Thursday, September 15, 2022 8:38 AM
> >>
> >> On Thu, Sep 15, 2022, at 7:01 AM, Parav Pandit wrote:
> >> > The cited commit [1] describes that when using writel(), explcit
> >> > wmb() is not needed. However, it should have said that dma_wmb() is
> >> > not needed.
> >>
> >> Are you sure? As I understand it, the dma_wmb() only serializes a set
> >> of memory accesses, but does not serialized against an MMIO access,
> >> which depending on the CPU architecture may require a different type of
> barrier.
> >>
> >> E.g. on arm, writel() uses __iowmb(), which like wmb() is defined as
> >> "dsb(x); arm_heavy_mb();", while dma_wmb() is a "dmb(oshst)".
> >
> > You are right, on arm heavy barrier dsb() is needed, while on arm64,
> > dmb(oshst) is sufficient.
> >
> > So more accurate documentation is to say that 'when using writel() a
> > prior IO barrier is not needed ...'
> >
> > How about that?
>
> That's probably fine, not sure if it's worth changing.
>
I think it is worth because current documentation, indirectly (or incorrectly) indicate that
"writel() does wmb() internally, so those drivers, who has difficulty in using writel() can do, wmb() + raw write".
And I sort of see above pattern in two drivers, and it is not good.
It ends up doing dsb(st) on arm64, while needed barrier is only dmb(oshst).

So to fix those two drivers, it is better to first avoid wmb() documentation reference when referring to writel().

> > It started with my cleanup efforts to two drivers [1] and [2] that had
> > difficulty in using writel() on 32-bit system, and it ended up open
> > coding writel() as wmb() + mlx5_write64().
> >
> > I am cleaning up the repetitive pattern of, wmb();
> > mlx5_write64()
> >
> > Before I fix drivers, I thought to improve the documentation that I
> > can follow. :)
>
> Right, that is definitely a good idea.
>
> I see that there is more going on with that function, at least the loop in
> post_send_nop() probably just wants to use __iowrite64_copy(), but that
> also has no barrier in it, while changing mlx5_write64() to use iowrite64be()
> or similar would of course add excessive barriers inside of the loop.

True. All other conversion seems possible.
For post_send_nop(), __iowmb() needs to be exposed, which is not available today and it is only one-off user,
I am inclined to keep post_send_nop() as-is, but want to improve/correct rest of the callers in these two drivers.