[PATCH v3] locking/memory-barriers.txt: Improve documentation for writel() example

The cited commit explains that an explicit wmb() is not needed when
using writel(). wmb() is an expensive barrier; writel() uses the
required platform-specific barrier instead of the expensive wmb().

Hence update the example so that it more accurately matches the current

commit 5846581e3563 ("locking/memory-barriers.txt: Fix broken DMA vs. MMIO ordering example")

Signed-off-by: Parav Pandit <parav@xxxxxxxxxx>

before we read the data from the descriptor, and the dma_wmb() allows
us to guarantee the data is written to the descriptor before the device
can see it now has ownership. The dma_mb() implies both a dma_rmb() and
- a dma_wmb(). Note that, when using writel(), a prior wmb() is not needed
+ a dma_wmb(). Note that, when using writel(), a prior barrier is not needed
to guarantee that the cache coherent memory writes have completed before
writing to the MMIO region. The cheaper writel_relaxed() does not provide
- this guarantee and must not be used here.
+ this guarantee and must not be used here. Hence, writeX() is always
+ preferred.

See the subsection "Kernel I/O barrier effects" for more information on
relaxed I/O accessors and the Documentation/core-api/dma-api.rst file for