Re: [PATCH v1] ntb_transport: replace wmb() with dma_wmb() in TX paths
From: xingbang liu
Date: Sun Apr 26 2026 - 22:00:40 EST
Hi DJ ,
Thanks for the review and the explanation.
I understand that dma_wmb() is intended for CPU → device DMA visibility, and
since this path writes to MMIO (peer memory window), a full wmb() is required
to guarantee that all writes are visible before the notification/doorbell.
I will withdraw this patch and keep the wmb() in place.
Thanks,
Xingbang Liu
Dave Jiang <dave.jiang@xxxxxxxxx> 于2026年4月24日周五 23:32写道:
>
>
>
> On 4/24/26 2:23 AM, Xingbang Liu wrote:
> > The TX paths currently use wmb() after memcpy_toio() to ensure that
> > writes to the peer memory window are visible before subsequent
> > notification.
> >
> > The actual ordering requirement is to ensure that data written to the
> > peer memory is observed before the associated notification or status
> > update. This matches the semantics of dma_wmb(), which orders memory
> > writes with respect to external observers.
> >
> > The peer memory window is mapped with ioremap_wc(), allowing
> > write-combining. On weakly ordered architectures such as arm64,
> > dma_wmb() ensures that prior writes are committed from write-combining
> > buffers and become visible to the peer before subsequent operations.
> >
> > This aligns with common patterns in PCIe-based drivers, where it is
> > used to order descriptor or payload writes before notification, while
> > avoiding the stronger ordering semantics of wmb().
> >
> > The ordering with respect to peer observation is completed by the
> > subsequent notification mechanism.
> >
> > No functional change is intended.
> >
> > Signed-off-by: Xingbang Liu <liu.airalert@xxxxxxxxx>
>
> NAK. dma_wmb() is used for shared memory between CPU and a device. For example:
>
> memcpy(dma_buffer, data, size);
> dma_wmb(); /* Ensure CPU writes visible to device DMA */
> writel(DOORBELL, device->reg + DB_OFFSET); /* trigger device DMA from dma_buffer */
>
> In this case we are copying data to MMIO and therefore wmb() is needed to ensure all MMIO writes have happened before we proceed.
>
> DJ
>
> > ---
> > drivers/ntb/ntb_transport.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/ntb/ntb_transport.c b/drivers/ntb/ntb_transport.c
> > index 7cabc82305d6..c8ef46ddcc57 100644
> > --- a/drivers/ntb/ntb_transport.c
> > +++ b/drivers/ntb/ntb_transport.c
> > @@ -1791,7 +1791,7 @@ static void ntb_memcpy_tx_on_stack(struct ntb_queue_entry *entry, void __iomem *
> > #endif
> >
> > /* Ensure that the data is fully copied out before setting the flags */
> > - wmb();
> > + dma_wmb();
> >
> > ntb_tx_copy_callback(entry, NULL);
> > }
>