RE: [PATCH v3 18/18] infiniband: cxgb4: Eliminate duplicate barriers on weakly-ordered archs
From: Steve Wise
Date: Fri Mar 16 2018 - 19:04:46 EST
>
> On Fri, Mar 16, 2018 at 04:05:10PM -0500, Steve Wise wrote:
> > > Code includes wmb() followed by writel(). writel() already has a
> > > barrier on some architectures like arm64.
> > >
> > > This ends up CPU observing two barriers back to back before executing
> > > the register write.
> > >
> > > Since code already has an explicit barrier call, changing writel() to
> > > writel_relaxed().
> > >
> > > Signed-off-by: Sinan Kaya <okaya@xxxxxxxxxxxxxx>
> >
> > NAK - This isn't correct for PowerPC. For PowerPC, writeX_relaxed() is
> > just writeX().
>
> ?? Why is changing writex() to writeX() a NAK then?
Because I want it correct for PPC as well.
>
> > I was just looking at this with Chelsio developers, and they said the
> > writeX() should be replaced with __raw_writeX(), not writeX_relaxed(),
> > to get rid of the extra barrier for all architectures.
>
> That doesn't seem semantically sane.
>
> __raw_writeX() should not appear in driver code, IMHO. Only the arch
> code can know what the exact semantics of that accessor are..
>
> If ppc can't use writel_relaxed to optimize then we probably need yet
> another io accessor semantic defined :(
Anybody understand why the PPC implementation of writeX_relaxed() isn't
relaxed?
Steve.
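
For anyone following along, here is a rough userspace sketch of the double
barrier the patch description is talking about. It is only a model: the
model_*() helpers, fake_db and barrier_count are made-up stand-ins, not the
kernel accessors, and it assumes an arm64-like arch where writel() behaves
as a barrier followed by a relaxed store. It just counts how many barriers
each doorbell sequence issues.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins only; none of these are the real kernel accessors. */
static int barrier_count;          /* barriers issued since the last reset */
static volatile uint32_t fake_db;  /* pretend doorbell register */

static void model_wmb(void)
{
	barrier_count++;
}

static void model_writel_relaxed(uint32_t val, volatile uint32_t *addr)
{
	*addr = val;
}

/* ASSUMPTION: arm64-like arch where writel() is barrier + relaxed store. */
static void model_writel(uint32_t val, volatile uint32_t *addr)
{
	model_wmb();
	model_writel_relaxed(val, addr);
}

/* Sequence before the patch: explicit wmb(), then writel(). */
static int doorbell_before_patch(uint32_t val)
{
	barrier_count = 0;
	model_wmb();
	model_writel(val, &fake_db);
	return barrier_count;
}

/* Sequence after the patch: explicit wmb(), then writel_relaxed(). */
static int doorbell_after_patch(uint32_t val)
{
	barrier_count = 0;
	model_wmb();
	model_writel_relaxed(val, &fake_db);
	return barrier_count;
}

/* Sanity check of the model: two barriers before, one after. */
static void doorbell_model_selfcheck(void)
{
	assert(doorbell_before_patch(1) == 2);
	assert(doorbell_after_patch(1) == 1);
}
```

In this model, if the relaxed variant were the same as writel(), as stated
above for PowerPC, the second sequence would still count two barriers, which
is the whole objection.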