Re: [PATCH v10 7/7] PCI: endpoint: pci-ep-msi: Add embedded doorbell fallback
From: Robin Murphy
Date: Thu Mar 26 2026 - 08:15:07 EST
On 2026-03-26 10:25 am, Niklas Cassel wrote:
On Thu, Mar 26, 2026 at 10:59:26AM +0100, Niklas Cassel wrote:
On Thu, Mar 26, 2026 at 05:49:13PM +0900, Koichiro Den wrote:
The transaction is a write from
PCIe bus -> PCIe controller iATU -> internal bus -> IOMMU -> PCIe controller
(the same controller as initiated the transaction).
Yes, I think we're on the same page about this path itself.
On my R-Car S4 setup, changing this to DMA_TO_DEVICE consistently triggers an
IOMMU fault, so at least on this platform the local path used for the doorbell
mapping is IOMMU-visible. That is the case this dma_map_resource() is intended
to cover.
For that path, my understanding is that the doorbell access ends up as a local
write on the EP side, so it needs write permission, hence DMA_FROM_DEVICE.
It would be interesting to know why this does not work the way it normally does (when using buffers):
"For Networking drivers, it’s a rather simple affair.
For transmit packets, map/unmap them with the DMA_TO_DEVICE direction specifier.
For receive packets, just the opposite, map/unmap them with the DMA_FROM_DEVICE
direction specifier."
I think the closer analogy is RX: the data comes from outside, but the device
writes to the target, so it needs write permission.
I think the question is, from the PoV of the IOMMU: is the transaction a read
by a device or a write by a device?
For a NIC driver:
For an RX packet, the data is coming from the device to the memory.
The device is doing a transaction to memory.
For a TX packet, the data is going from the memory to the device.
In our case, the data is coming from the device, to a device.
Almost like a P2P DMA, but in our case, both devices are the same, so
using the P2P DMA API like pci_p2pdma_add_resource() seems unnecessary.
So should it be DMA_BIDIRECTIONAL ? :)
I understand that for the R-Car S4 Spider IOMMU, it is sufficient to map
it as DMA_FROM_DEVICE. My concern is that some other IOMMU might consider
it sufficient to map this as DMA_TO_DEVICE (because it is also a
transaction going to a device).
I just want to make sure that the code works on more than one IOMMU.
Perhaps some IOMMU experts could help chime in.
Note that I am happy to merge the code as is, as it obviously works on the
only platform that this has been tested on (R-Car S4 Spider). If another
platform tries to run this test case and its IOMMU works differently, it
will scream and they will report it to the list. So all good.
I mostly want to know how the DMA-API is supposed to be used in this
specific scenario (device doing a write transaction to the same device).
I guess what matters is whether the device will be reading from or writing
to the IOVA that the IOMMU creates for the dma_map() call...
The device will only be writing to this IOVA.
The device will never be reading from the IOVA (since the physical address
is a register in the device itself, we will never supply this IOVA for the
device to read from).
DMA_FROM_DEVICE seems correct in all cases. DMA_BIDIRECTIONAL seems wrong
since the device will never read from this IOVA. Sorry for the noise.
DMA_FROM_DEVICE means the device will be sending write requests to the address; DMA_TO_DEVICE means the device will be sending read requests to the address; DMA_BIDIRECTIONAL means the device might send either or both. Never does the *address* try to pull or push data anywhere of its own accord, regardless of whether it's RAM or MMIO or whatever. Hence why we don't have or need DMA_{TO,FROM}_MEMORY either ;)
Indeed if the endpoint controller could somehow recognise that an incoming write was a "doorbell" write based on the incoming BAR offset and internally short-circuit that to directly raise its own interrupt *without* propagating a transaction into the rest of the system, then it wouldn't need the DMA mapping. However, if it's just set up to forward ranges of BAR address to ranges of system addresses, and there's an IOMMU in that path (such that the iATU's "system addresses" at that point are actually IOVAs), then all that really matters is what requests the IOMMU is going to see coming out of the iATU, irrespective of whether a particular system address after translation happens to be an MSI controller, and that MSI controller coincidentally happens to live right next to the iATU itself.
(Of course the DMA API itself does also care about the general distinction of the target address being "memory" or "not memory", as implied by dma_map_page/single/sg() vs. dma_map_resource(), but that's mostly just for the software aspects of knowing not to attempt cache maintenance etc. on the "not memory" where it wouldn't be meaningful or valid.)
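To make the direction semantics above concrete, here is a minimal userspace sketch, not kernel code. It models the direction argument being turned into IOMMU read/write permissions, and then checks whether a doorbell write coming out of the iATU would be let through. All model_*/MODEL_* names are hypothetical and exist only for illustration; the direction-to-permission mapping is my understanding of what an IOMMU-backed DMA API implementation does, so treat it as an assumption rather than a statement about any particular driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: every model_/MODEL_ name here is hypothetical. */
enum model_dma_dir {
	MODEL_DMA_BIDIRECTIONAL = 0,	/* device may read and write */
	MODEL_DMA_TO_DEVICE = 1,	/* device sends read requests */
	MODEL_DMA_FROM_DEVICE = 2,	/* device sends write requests */
	MODEL_DMA_NONE = 3,
};

#define MODEL_IOMMU_READ	(1u << 0)
#define MODEL_IOMMU_WRITE	(1u << 1)

/* Assumed mapping from DMA direction to the permissions put in the IOMMU. */
unsigned int model_dir_to_prot(enum model_dma_dir dir)
{
	switch (dir) {
	case MODEL_DMA_BIDIRECTIONAL:
		return MODEL_IOMMU_READ | MODEL_IOMMU_WRITE;
	case MODEL_DMA_TO_DEVICE:
		return MODEL_IOMMU_READ;
	case MODEL_DMA_FROM_DEVICE:
		return MODEL_IOMMU_WRITE;
	default:
		return 0;
	}
}

/* The IOMMU only sees the request type, not what lives behind the address. */
bool model_iommu_allows(unsigned int prot, bool is_write)
{
	unsigned int needed = is_write ? MODEL_IOMMU_WRITE : MODEL_IOMMU_READ;

	return (prot & needed) != 0;
}

/* A doorbell access forwarded by the iATU is always a write request. */
bool model_doorbell_write_ok(enum model_dma_dir dir)
{
	assert(dir <= MODEL_DMA_NONE);
	return model_iommu_allows(model_dir_to_prot(dir), true);
}
```

In this model, model_doorbell_write_ok() returns true for MODEL_DMA_FROM_DEVICE and MODEL_DMA_BIDIRECTIONAL and false for MODEL_DMA_TO_DEVICE, which lines up with the IOMMU fault reported on R-Car S4 when the mapping was changed to DMA_TO_DEVICE.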
Thanks,
Robin.
Kind regards,
Niklas