Re: [PATCH v3 18/18] infiniband: cxgb4: Eliminate duplicate barriers on weakly-ordered archs

From: Jason Gunthorpe
Date: Fri Mar 16 2018 - 18:14:09 EST


On Fri, Mar 16, 2018 at 04:05:10PM -0500, Steve Wise wrote:
> > Code includes wmb() followed by writel(). writel() already has a barrier on
> > some architectures like arm64.
> >
> > This ends up CPU observing two barriers back to back before executing the
> > register write.
> >
> > Since code already has an explicit barrier call, changing writel() to
> > writel_relaxed().
> >
> > Signed-off-by: Sinan Kaya <okaya@xxxxxxxxxxxxxx>
>
> NAK - This isn't correct for PowerPC. For PowerPC, writeX_relaxed() is just
> writeX().

?? Why is changing writeX() to writeX() a NAK then?
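
To spell the point out, here is a rough userspace model, not kernel code.
Everything prefixed mock_ and both ring_doorbell_*() helpers are made up for
illustration. The model assumes writel() carries one barrier on both
architectures, that writel_relaxed() carries none on arm64, and that it is
plain writel() on PowerPC, as stated above. It only counts barriers; it says
nothing about real ordering guarantees.

```c
/* Userspace model only: mocks that count barriers, not the kernel accessors. */
#include <assert.h>
#include <stdint.h>

enum arch_model { MODEL_ARM64, MODEL_PPC };

static enum arch_model model;
static unsigned int barriers;	/* barriers issued since the last reset */

/* Stand-in for wmb(): always one barrier. */
static void mock_wmb(void)
{
	barriers++;
}

/* Plain store with no barrier attached. */
static void mock_store(uint32_t val, volatile uint32_t *reg)
{
	assert(reg);
	*reg = val;
}

/* Stand-in for writel(): assumed to carry its own barrier in both models. */
static void mock_writel(uint32_t val, volatile uint32_t *reg)
{
	barriers++;
	mock_store(val, reg);
}

/* Stand-in for writel_relaxed(): same as writel() in the PowerPC model. */
static void mock_writel_relaxed(uint32_t val, volatile uint32_t *reg)
{
	if (model == MODEL_PPC)
		mock_writel(val, reg);
	else
		mock_store(val, reg);
}

/* Pattern before the patch: wmb() followed by writel(). */
static unsigned int ring_doorbell_before(enum arch_model m,
					 volatile uint32_t *reg, uint32_t val)
{
	model = m;
	barriers = 0;
	mock_wmb();
	mock_writel(val, reg);
	return barriers;
}

/* Pattern after the patch: wmb() followed by writel_relaxed(). */
static unsigned int ring_doorbell_after(enum arch_model m,
					volatile uint32_t *reg, uint32_t val)
{
	model = m;
	barriers = 0;
	mock_wmb();
	mock_writel_relaxed(val, reg);
	return barriers;
}
```

Under those assumptions the patch takes the arm64 model from two barriers to
one and leaves the PowerPC model at two either way, i.e. it changes nothing
there.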

> I was just looking at this with Chelsio developers, and they said the
> writeX() should be replaced with __raw_writeX(), not writeX_relaxed(), to
> get rid of the extra barrier for all architectures.

That doesn't seem semantically sane.

__raw_writeX() should not appear in driver code, IMHO. Only the arch
code can know what the exact semantics of that accessor are.

If ppc can't use writel_relaxed to optimize then we probably need yet
another io accessor semantic defined :(

Jason